Abstract



The concept of data depth in non-parametric multivariate descriptive
statistics is the generalization of the univariate rank method to
multivariate data. Halfspace depth is a measure of data depth. Given a set
S of points and a point p, the halfspace depth (or rank) k of p is defined
as the minimum number of points of S contained in any closed halfspace with
p on its boundary. Computing halfspace depth is NP-hard, and it is
equivalent to the Maximum Feasible Subsystem problem. In this thesis a
mixed integer program is formulated with the big-M method for the halfspace
depth problem. We suggest a branch and cut algorithm. In this algorithm,
Chinneck's heuristic algorithm is used to find an upper bound and a related
technique based on sensitivity analysis is used for branching. Irreducible
Infeasible Subsystem (IIS) hitting set cuts are applied. We also suggest a
binary search algorithm which may be more numerically stable. The
algorithms are implemented with the BCP framework from the COIN-OR project.
Chapter 1

The Halfspace Depth Problem 

1.1 Data Depth 

Halfspace depth is a measure of data depth. The term data depth comes from 
non-parametric multivariate descriptive statistics. Descriptive statistics is 
used to summarize a collection of data, for example by estimating the center 
of the data set. In non-parametric statistics, the probability distribution of 
the population is not considered, and the test statistics are usually based 
on the rank of the data. In multivariate data analysis, every data item 
consists of several elements (i.e. is an n-tuple). The idea of data depth 
in multivariate data analysis is to generalize the univariate rank method to 
tuple data, and order the data in a center-outward fashion. Since the tuple
data items can be represented as points in Euclidean space $\mathbb{R}^d$,
the terms "data item" and "point" are used interchangeably in this thesis.
The rank or depth of a point measures the centrality of this point with
respect to a given set of points in high dimensional space. The data item
with the highest rank is considered the center or median of the data set,
which best describes the data set.

In $\mathbb{R}^1$, the median has the properties of high breakdown point,
affine equivariance, and monotonicity. The breakdown point of a measure is
the fraction of the input that must be moved to infinity before the measure
itself moves to infinity. In $\mathbb{R}^1$, the median has a breakdown
point of 1/2 [3]. Applying an affine transformation to the data set and then
taking the median gives the same result as applying the transformation to
the median of the original data; therefore, the median is affine
equivariant. When extra data is added to one side of the data set, the
median tends to move to that side, never moving to the opposite side. This
property is called monotonicity. A good measure of data depth should also
have these properties, ideally to the same degree as the median does.

Many measures of data depth have been introduced, such as halfspace 
depth [24, 53], convex hull peeling depth [6, 51], Oja depth [39], simplicial 
depth [30], majority depth [52], regression depth [48], and so on. The sur- 
veys [3, 20, 31, 42] give detailed introductions to these measures. 

1.2 Halfspace Depth 

The halfspace depth is also called Tukey depth. Given a set S of points and
a point p in $\mathbb{R}^d$, the halfspace depth of p is defined as the
minimum number of points of S contained in any closed halfspace with p on
its boundary. The point with the largest depth is called the halfspace
median or Tukey median.

Figure 1.1: An example of halfspace depth in $\mathbb{R}^2$

Figure 1.2: Another example of halfspace depth in $\mathbb{R}^2$



In Figure 1.1, the halfspace depth of p is 3, because every closed
halfspace with p on its boundary contains at least three points. The depth
of point p in Figure 1.2 is 0.

The halfspace depth of point p can also be described as:

\[ \min_{x \in \mathbb{R}^d \setminus \{0\}} \left| \{ q \in S \mid \langle x, q \rangle \le \langle x, p \rangle \} \right| \tag{1.1} \]

where x is the outward normal vector of the closed halfspace. From
Figure 1.1, we can see that if a point q is contained in the closed
halfspace, the corresponding inequality
$\langle x, q \rangle \le \langle x, p \rangle$ will be satisfied. So we are
trying to find an x which minimizes the number of satisfied inequalities.
Minimizing the number of points contained in the halfspace is equivalent to
maximizing the number of points excluded from the halfspace. Therefore,
the definition of halfspace depth can also be described as:

\[ |S| - \max_{x \in \mathbb{R}^d \setminus \{0\}} \left| \{ q \in S \mid \langle x, q \rangle > \langle x, p \rangle \} \right| \tag{1.2} \]

When a point is excluded from the halfspace, the corresponding inequality
in (1.2) is satisfied. Then the problem is to find a vector x that maximizes
the number of satisfied inequalities.

A data set is said to be in general position if it has no ties, no more than
two points on the same line, no more than three points on the same plane
and so forth. If the data set is in general position, computing the
halfspace depth of a point is identical to the open hemisphere problem
introduced by Johnson and Preparata. Given a set of n points on the unit
sphere $S^{d-1}$ in $\mathbb{R}^d$, the open hemisphere problem is to find
an open hemisphere of $S^{d-1}$ that contains as many points as possible.
This problem is NP-complete if both n and d are parts of the input [25].
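
As a small illustration of definition (1.1), the following sketch computes the halfspace depth of a point in the plane by brute force. It is not the algorithm developed in this thesis; the function name `halfspace_depth_2d` and the sweep over candidate directions are assumptions made for the example. The count in (1.1) only changes when x becomes perpendicular to some q - p, so it is enough to test one direction between each pair of consecutive critical angles. The code uses floating-point arithmetic and is meant for small, well-separated inputs only.

```python
import math


def halfspace_depth_2d(points, p):
    """Brute-force halfspace depth of p with respect to points in the plane.

    Implements min over directions x of |{q : <x, q - p> <= 0}|.
    """
    vecs = [(qx - p[0], qy - p[1]) for qx, qy in points]
    nonzero = [v for v in vecs if v[0] != 0 or v[1] != 0]
    if not nonzero:
        # Every data point coincides with p, so every halfspace contains all.
        return len(vecs)

    # Directions perpendicular to some q - p are the only places where the
    # count can change.
    critical = set()
    for vx, vy in nonzero:
        theta = math.atan2(vy, vx)
        critical.add((theta + math.pi / 2) % (2 * math.pi))
        critical.add((theta - math.pi / 2) % (2 * math.pi))
    angles = sorted(critical)

    best = len(vecs)
    for i, a in enumerate(angles):
        b = angles[(i + 1) % len(angles)]
        if b <= a:
            b += 2 * math.pi  # wrap around the circle
        mid = (a + b) / 2
        x = (math.cos(mid), math.sin(mid))
        # Points in the closed halfspace {q : <x, q - p> <= 0}.
        count = sum(1 for vx, vy in vecs if x[0] * vx + x[1] * vy <= 0)
        best = min(best, count)
    return best
```

For example, `halfspace_depth_2d([(1, 0), (-1, 0), (0, 1), (0, -1)], (0, 0))` evaluates the depth of the center of a four-point cross, and a query point far outside the data gets depth 0.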

1.2.1 Properties of Halfspace Depth 

If the data set is in general position, the depth of the halfspace median is
between $\lceil n/(d+1) \rceil - 1$ and $\lceil n/2 \rceil - 1$ [18] (our
depth values differ by one from the ones in [18]). The halfspace median
might not be unique, but the
measure of halfspace depth is preferred by statisticians compared with other 
measures because it has the following properties: 

High Breakdown Point In $\mathbb{R}^d$, the breakdown point of the halfspace
median is at least $1/(d+1)$ and can be as high as 1/3 when d is greater
than 2 [18, 19].

Affine Equivariance After an affine transformation of the data set, the
rank value of any data item will not be changed. 

Monotonicity The halfspace median tends to move to the location where 
data is added. 

Convex and Nested The boundary of the set of data with depth at least
k is called the contour of depth k [18] (see Figure 1.3). The contours
are all convex. The contours are also nested: the contour of depth
k + 1 is completely contained in the contour of depth k [18].

Figure 1.3: Halfspace depth contours

1.3 Overview of This Thesis 

In this chapter we introduced the definition of the halfspace depth and some
basic properties. In Chapter 2 we show that the halfspace depth problem
is equivalent to the maximum feasible subsystem (MAX FS) problem. In
Chapter 3 we discuss different integer program formulations for the
halfspace depth problem. In Chapter 4 we introduce the heuristic algorithm
developed by Chinneck for the MAX FS problem. In Chapter 5 we introduce the
branch and cut method for solving general integer programs. In Chapter 6
we introduce our branch and cut algorithm for the halfspace depth problem.
We also introduce a binary search strategy in that chapter, motivated by
the fact that we cannot check the accuracy of the results. In Chapter 7 we
describe the details of the implementation of our algorithm, which is built
on the BCP framework; BCP is also briefly introduced in that chapter. In
Chapter 8 we give some testing results and benchmark the performance of
our algorithm. In Chapter 9 we summarize the work in this thesis, and give
some conclusions.

Throughout this thesis we assume that the reader is familiar with linear
programming, integer programming, and combinatorial optimization. For
linear programming, we refer to the books by Chvátal [15] and by Hillier
and Lieberman [23].
Chapter 2



Maximum Feasible Subsystem 

2.1 Introduction 

The halfspace depth problem has a strong connection with the maximum 
feasible subsystem problem. If a linear system has no solution, we say this 
system is infeasible. Given an infeasible linear system, the MAX FS problem 
is to find a maximum cardinality feasible subsystem. This problem is NP- 
hard [11, 49], and it is also hard to approximate [4]. Pfetsch shows several 
applications of MAX FS in [41], for example, linear programming, telecom- 
munications, and machine learning. 

When point p is contained in the convex hull of S, and p is on the
boundary of a closed halfspace, as shown in Figure 1.1, there must be some
data contained by the halfspace. Then the set of inequalities

\[ \langle x, q \rangle > \langle x, p \rangle \quad \forall q \in S \tag{2.1} \]

or

\[ \langle x, q - p \rangle > 0 \quad \forall q \in S \tag{2.2} \]

in (1.2) cannot all be satisfied at the same time; in other words, (2.2) is
an infeasible linear system. To compute the halfspace depth of point p is to
find the maximum number of inequalities in (2.2) that can be satisfied at
the same time, that is, to find a maximum feasible subsystem of (2.2).
Therefore, the halfspace depth problem is a MAX FS problem. Of course, if p
is outside of the convex hull of S, as in Figure 1.2, (2.2) will be
feasible, and the depth of p will be 0.

The MAX FS problem can also be seen as finding a minimum cardinality 
set of constraints, whose removal makes the original infeasible system feasible. 
This problem is called the minimum unsatisfied linear relation (MIN ULR) 
problem. 

2.2 Irreducible Infeasible Subsystems 

In an infeasible linear system, an irreducible infeasible subsystem (IIS) is
a subset of constraints that is itself infeasible, but every proper
subsystem of which is feasible. If a subset of points A of S forms a simplex
which contains p, the inequalities in (2.2) defined by A form an IIS. For
example, in Figure 2.1 three points form a simplex which contains point p,
so they cannot be simultaneously excluded from any closed halfspace with
boundary through p. Then the corresponding inequalities form an infeasible
system. The system is irreducible because if any point is removed, the other
two can be excluded at the same time. The point set A is a minimal
dominating set (MDS), which is a set of points forming a minimal convex hull
that contains p [9]. A degenerate MDS is shown in Figure 2.2, where the
three points are collinear.

Figure 2.1: An MDS in $\mathbb{R}^2$

Every infeasible system contains one or more IISs. To make the original
system feasible, we need to delete at least one inequality from every IIS;
in other words, we need to delete a hitting set of all IISs in the
infeasible system. The minimum-cardinality IIS set-covering (MIN IIS COVER)
problem is to find the smallest cardinality set of constraints that hits all
IISs of the original system (this problem is a minimum hitting set problem,
although it is called a set cover problem in [14, 40]). The MIN IIS COVER
set (hitting set) is the smallest set of constraints whose removal makes the
original infeasible system feasible. Hence, the MIN IIS COVER problem is
identical to the MIN ULR problem, and hence to the MAX FS problem.

Figure 2.2: A degenerate MDS in $\mathbb{R}^2$

Parker gives a method for the MAX FS problem in [40], and Pfetsch 
further develops this method in [41]. Due to the fact that the infeasible 
system could contain an exponential number of IISs with respect to the 
number of constraints and the number of variables [11], the main idea of 
Parker's method is finding a subset of IISs in the whole problem and solving 
an integer program to find a minimum hitting set in each iteration. If the 
hitting set hits all IISs in the original infeasible system, the optimum solution 
is found. If not, find some IISs that are not hit by the current hitting set, 
then find (with an integer program) a new minimum hitting set that also hits 
the new IISs. 

An important part of this method is finding IISs. Given a linear system
$Ax \ge b$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$,
the following polyhedron:

\[ P = \{ y \in \mathbb{R}^m \mid y^T A = 0,\ y^T b = 1,\ y \ge 0 \} \tag{2.3} \]

is defined as the alternative polyhedron. Each vertex of P corresponds to an
IIS in the original infeasible system [21, 27, 40, 41]. More precisely, the
support of a vertex (the set of indices of its non-zero components)
corresponds to an IIS.
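
To make the role of the alternative polyhedron (2.3) concrete, the sketch below checks whether a given vector y lies in P and, if so, returns its support. The function name `iis_support` and the tolerance are assumptions for this example. The function does not test whether y is a vertex of P, so the returned support is only guaranteed to be an IIS when y is known to be a vertex.

```python
def iis_support(A, b, y, tol=1e-9):
    """Return the support of y if y is in P = {y >= 0, y^T A = 0, y^T b = 1}.

    A is a list of m rows (each of length n), b and y have length m.
    Returns None when y is not in P.
    """
    m, n = len(A), len(A[0])
    if len(b) != m or len(y) != m:
        raise ValueError("dimension mismatch")
    if any(yi < -tol for yi in y):
        return None  # violates y >= 0
    for j in range(n):
        if abs(sum(y[i] * A[i][j] for i in range(m))) > tol:
            return None  # violates y^T A = 0
    if abs(sum(y[i] * b[i] for i in range(m)) - 1.0) > tol:
        return None  # violates y^T b = 1
    return [i for i in range(m) if y[i] > tol]
```

For the one-variable system x >= 1, -x >= 0, x >= -5 (rows `[[1], [-1], [1]]`, right hand side `[1, 0, -5]`), the vector y = (1, 1, 0) lies in P and its support picks out the first two constraints, which are exactly the conflicting pair.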

In this chapter we introduced the maximum feasible subsystem problem
and irreducible infeasible subsystems, and showed that the MAX FS problem
is equivalent to the halfspace depth problem. In the next chapter we will
explore the mixed integer program modeling of the halfspace depth problem.
Chapter 3

Mixed Integer Program (MIP) Formulation

Parker suggests two integer program formulations for the MIN IIS COVER 
problem in [40]. One is applying the big-M method (see [40] and [41]) to 
the inequalities in the infeasible system, and the other is based on the IIS 
inequalities. In this chapter, we will introduce these MIP formulations. 

3.1 The Infeasible System 

Suppose we have a set of data points $\{A^1, A^2, \ldots, A^n\}$ and a point
$A^p$ in Euclidean space $\mathbb{R}^d$, and x is the normal vector of the
halfspace that defines the halfspace depth of $A^p$. Finding the halfspace
depth of $A^p$ is equivalent to finding the MIN IIS COVER $\Gamma$ of the
following system:

\[
\begin{aligned}
\sum_{i=1}^{d} (A^1_i - A^p_i)\, x_i &> 0 \\
\sum_{i=1}^{d} (A^2_i - A^p_i)\, x_i &> 0 \\
&\;\;\vdots \\
\sum_{i=1}^{d} (A^n_i - A^p_i)\, x_i &> 0
\end{aligned}
\tag{3.1}
\]

The depth of $A^p$ is $|\Gamma|$.

3.2 Parker's Formulation 

Parker reports that the integer program formulated with the big-M method 
is hard to solve when the problem size is large. In [40], Parker deals with 
the MIN IIS COVER problem with an integer program formulated using the 
IIS inequalities. First of all, let us introduce the IIS inequalities. MIN IIS 
COVER is a minimum hitting set problem, and the hitting set has at least 
one constraint in common with every IIS in the infeasible system. For an IIS 
C in (3.1), we can use the binary variables associated with the constraints
in C to formulate an inequality like

\[ \sum_{t \in C} s_t \ge 1 \tag{3.2} \]

where $s_t$ is the binary variable associated with constraint t in (3.1).

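Using the IIS inequalities, a hitting set integer program is formulated in
the following form:

\[
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{n} s_i \\
\text{subject to} \quad & \sum_{i \in C} s_i \ge 1 && \forall C \ (\text{IIS of system (3.1)}) \\
& s_i \in \{0, 1\} && \forall i \in \{1, 2, \ldots, n\}
\end{aligned}
\tag{3.3}
\]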

As we mentioned in Section 2.2, Parker's strategy is to first find a small
set of IISs and formulate an integer program (a subprogram of (3.3)). After
obtaining the optimum solution to the initial integer program, find some
IISs that are not hit by the solution, add the corresponding IIS
inequalities to the integer program and re-solve it. The process stops when
the solution hits all IISs in the infeasible system.
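
The iteration just described can be sketched on a toy instance. This is only an illustration of the control flow, not Parker's implementation: the hitting set integer program is replaced by brute-force enumeration, and the step that finds an IIS missed by the current solution is replaced by a hypothetical oracle (`make_oracle`) that scans an explicitly given list of IISs. In practice that list is exponentially large and the oracle would be an optimization over the alternative polyhedron.

```python
from itertools import combinations


def min_hitting_set(n, iis_list):
    """Smallest subset of {0, ..., n-1} meeting every set in iis_list."""
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            chosen = set(cand)
            if all(chosen & iis for iis in iis_list):
                return chosen
    raise ValueError("no hitting set exists (is some IIS empty?)")


def make_oracle(all_iis):
    """Hypothetical stand-in for the IIS search: scan a known list."""
    def find_unhit(cover):
        for iis in all_iis:
            if not (cover & iis):
                return iis
        return None
    return find_unhit


def iterative_hitting_set(n, find_unhit):
    """Grow the set of known IISs until the current cover hits all of them."""
    known = []
    while True:
        cover = min_hitting_set(n, known)   # solve the restricted program
        unhit = find_unhit(cover)           # look for a violated IIS
        if unhit is None:
            return cover, len(known)        # cover hits every IIS
        known.append(unhit)                 # add the new IIS inequality
```

With five constraints and the IISs `[{0, 1, 2}, {2, 3}, {3, 4}, {0, 4}]`, calling `iterative_hitting_set(5, make_oracle(...))` returns a cover of two constraints together with the number of IIS inequalities that had to be generated.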

3.3 An MIP with the Big-M Method 

Instead of using the hitting set integer program, we treat the halfspace depth 
problem with the big-M method. To formulate an integer program for the 
halfspace depth problem, the strict inequalities in system (3.1) need to be 
transformed into non-strict ones. From (3.1), we can derive the following
possibly infeasible system:

\[
\begin{aligned}
\sum_{i=1}^{d} (A^1_i - A^p_i)\, x_i &\ge \varepsilon \\
\sum_{i=1}^{d} (A^2_i - A^p_i)\, x_i &\ge \varepsilon \\
&\;\;\vdots \\
\sum_{i=1}^{d} (A^n_i - A^p_i)\, x_i &\ge \varepsilon
\end{aligned}
\tag{3.4}
\]

where $\varepsilon$ is a small positive real number. We can get rid of the
$\varepsilon$ in (3.4) by dividing both sides of the inequalities by
$\varepsilon$. Then we will have the following system:

\[
\begin{aligned}
\sum_{i=1}^{d} (A^1_i - A^p_i)\, x_i &\ge 1 \\
\sum_{i=1}^{d} (A^2_i - A^p_i)\, x_i &\ge 1 \\
&\;\;\vdots \\
\sum_{i=1}^{d} (A^n_i - A^p_i)\, x_i &\ge 1
\end{aligned}
\tag{3.5}
\]

Because the elements of x are variables, the elements of $x/\varepsilon$ are
still variables. Hence, the left hand sides of (3.5) are the same as those
of (3.4). For the halfspace depth problem, we formulate a mixed integer
program with the big-M method as follows:

\[
\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n} s_j \\
\text{subject to} \quad & \sum_{i=1}^{d} (A^j_i - A^p_i)\, x_i + s_j M \ge 1 && \forall j \in \{1, 2, \ldots, n\} \\
& s_j \in \{0, 1\} && \forall j \in \{1, 2, \ldots, n\} \\
& -\infty < x_i < +\infty && \forall i \in \{1, 2, \ldots, d\}
\end{aligned}
\tag{3.6}
\]

Fixing the binary variable $s_j$ to 1 has the effect of removing constraint
j from (3.1). The objective function is to minimize the number of
constraints that have to be removed to obtain a feasible subsystem of (3.1).
For general MIN IIS COVER problems, the big-M method may not be practical.
As Parker and Pfetsch mention, M should be big enough to make the infeasible
system feasible, but if it is too big, it will cause numerical problems (see
[40] for details). On the other hand, Pfetsch notes that this method works
reasonably well in the digital broadcasting application [47]. In this thesis
we investigate the big-M method for the halfspace depth problem.

In this formulation, it is easy to find a value for M that makes (3.6)
feasible, but the value of M should be large enough to guarantee an accurate
result. It is easy to see that if M is set to 1, (3.6) will be feasible, but
the optimal solution will in general not be the MIN IIS COVER of (3.5)
because all the binary variables will be forced to 1. Let $x_o$ be the value
of x in the optimum solution of (3.6) that corresponds to a MIN IIS COVER of
(3.5). For some point $A^t$ in the input data set,
$\langle x_o, A^t \rangle$ could be a large negative number, which would
require the value of M to be very large.
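
The dependence on M can be seen by fixing a direction x and asking which binary variables of (3.6) must be set to 1. The helper below is a hypothetical illustration (the name `big_m_assignment` is not from the thesis): it returns the cheapest choice of the $s_j$ for a given x, or None when M is too small for that x to be feasible at all.

```python
def big_m_assignment(points, p, x, M):
    """Cheapest s for a fixed x in formulation (3.6), or None if infeasible.

    Constraint j reads <A^j - A^p, x> + s_j * M >= 1 with s_j in {0, 1}.
    """
    s = []
    for q in points:
        lhs = sum((qi - pi) * xi for qi, pi, xi in zip(q, p, x))
        if lhs >= 1:
            s.append(0)          # constraint j holds without relaxation
        elif lhs + M >= 1:
            s.append(1)          # constraint j is "removed" via big-M
        else:
            return None          # M is too small to relax constraint j
    return s
```

With the points (1, 0), (-1, 0), (0, 1), (0, -3), the query point at the origin and x = (2, 2), the value M = 10 allows the two violated constraints to be relaxed, while M = 3 makes this x infeasible because one inner product is -6.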



3.4 An Alternative MIP 

Because of the difficulty of finding a proper value for M in the big-M
method, we can keep the $\varepsilon$. Using the big-M method directly on
(3.4), we can formulate the following mixed integer program:

\[
\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n} s_j \\
\text{subject to} \quad & \sum_{i=1}^{d} (A^j_i - A^p_i)\, x_i + s_j M \ge \varepsilon && \forall j \in \{1, 2, \ldots, n\} \\
& s_j \in \{0, 1\} && \forall j \in \{1, 2, \ldots, n\} \\
& -\infty < x_i < +\infty && \forall i \in \{1, 2, \ldots, d\}
\end{aligned}
\tag{3.7}
\]

In this formulation, $\varepsilon$ should be small enough to guarantee an
accurate result of halfspace depth. At the same time, M should also be big
enough. From the definition of inner product, we can get the following
observation:

\[ x \cdot q = \|x\| \cdot \|q\| \cdot \cos\alpha \le \|x\| \cdot \|q\| \tag{3.8} \]

Now we can give a bound $-c \le x_i \le c$ for each element of vector x,
where c is a constant number. Suppose point $q_{\max}$ is the point with the
largest norm value in the data set. Then M can be set to the value of
$\sqrt{d} \cdot c \cdot \|q_{\max}\|$.
When computing the halfspace depth, maximizing the number of points 
contained in an open halfspace is the same as maximizing the number of 
points contained in a cone. For instance, if the open halfspace in Figure 3.1 
is replaced by a cone (see Figure 3.2), we can still have the same depth value 
for point p. 
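
Returning briefly to the bound on M derived from (3.8): once every component of x is restricted to [-c, c], the norm of x is at most $\sqrt{d} \cdot c$, so no inner product can exceed $\sqrt{d} \cdot c \cdot \|q_{\max}\|$ in absolute value. The short sketch below computes this value; the function name `big_m_bound` is an assumption for the example, and the points are taken to be already translated so that p is the origin.

```python
import math


def big_m_bound(points, c):
    """Value sqrt(d) * c * ||q_max|| suggested for M when |x_i| <= c."""
    d = len(points[0])
    largest_norm = max(math.sqrt(sum(v * v for v in q)) for q in points)
    return math.sqrt(d) * c * largest_norm
```

For the points (3, 4) and (1, 0) with c = 2 the bound is $10\sqrt{2}$, and the largest inner product actually attained on the box, 14 at x = (2, 2), stays below it.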

Suppose the angle between the boundary of the cone and the halfspace is
$\theta$. Without loss of generality, we can take p as the origin of the
space. Let x be an outward normal vector of the halfspace, q be a point
contained by the cone, and $\alpha$ be the angle between x and q (q is a
point in the data set, and thus a vector in the space). The definition of
the inner product tells us

\[ x \cdot q = \|x\| \cdot \|q\| \cdot \cos\alpha = \|x\| \cdot \|q\| \cdot \sin\left(\frac{\pi}{2} - \alpha\right) \ge \|x\| \cdot \|q\| \cdot \sin\theta \tag{3.9} \]

where the last inequality is illustrated in Figure 3.2. Suppose point
$q_{\min}$ is the point with the smallest norm value in the data set; then,
no matter what direction x points to in the optimal solution of (3.7), there
is always an x which makes the following inequality satisfied for any
point q.

\[ x \cdot q \ge \|x\| \cdot \|q_{\min}\| \cdot \sin\theta \tag{3.10} \]

Now the question is how to find $\theta$ when deciding the $\varepsilon$ for
(3.7). With a proper $\theta$, $\varepsilon$ can be set to the value of
$\sqrt{d} \cdot c \cdot \|q_{\min}\| \cdot \sin\theta$.

In $\mathbb{R}^2$, let C be a circle centered on p. A point q defines an arc
on C such that every point on this arc defines an x which corresponds to an
open halfspace that contains q, where x is the outward normal vector of the
corresponding halfspace. The arcs are actually half circles. The halfspace
depth problem then can be viewed as finding a point which is contained in
the largest number of arcs [8]. The optimal solutions will form an arc
(solution arc) intersected by the largest number of half circles. From
Figure 3.3 we can see that the points excluded from the solution halfspace
must be contained in the cone. Suppose the angle corresponding to the
solution arc is $\beta$; then $\beta = 2\theta$. Therefore, the smallest
angle $\gamma$ between the lines gives a lower bound of $2\theta$.

Figure 3.3: Intersections of the arcs (the cone defines the optimal solution)

In $\mathbb{R}^3$, we can replace the half circles with half spheres. Then
the halfspace depth problem can be viewed as finding a point contained in
the largest number of half spheres. The smallest inscribed cone with p as
its apex in the intersections of the half balls that correspond to the half
spheres will define the lower bound of $\theta$. The lower bound is half of
the opening angle of the cone. In higher dimensional space the situation
will become more complex. We have not found a good way so far to compute
the lower bound.

The value of $\varepsilon$ can also be bounded by the method suggested by
David Bremner and Achill Schürmann. In this new method all the data are
considered as integral data (fractional data can be scaled up to integral
data). Then the input data set S will be a subset of the integer lattice
$\mathbb{Z}^d$ [22]. The following theorem is given by Achill Schürmann.

Theorem 3.4.1. Suppose the integral points $\{X_1, X_2, \ldots, X_d\}$ are
affinely independent in $\mathbb{R}^d$, and every point $X_i$ lies in
$C_m := \{X \in \mathbb{R}^d : |X_j| \le m,\ j = 1, 2, \ldots, d\}$.
Let H be the affine hull of $\{X_1, X_2, \ldots, X_d\}$, and suppose H does
not contain the origin O. Then the distance from O to H satisfies

\[ \mathrm{dist}(H, O) \ge (2m\sqrt{d})^{-(d-1)} \tag{3.11} \]

Proof. Let

\[ l := X_1 + \mathbb{Z}(X_2 - X_1) + \cdots + \mathbb{Z}(X_d - X_1) \]

be a lattice within H. Let $l_0 := H \cap \mathbb{Z}^d$; then we have
$l \subseteq l_0$. The distance h of H to a parallel plane containing
lattice points is $h = 1/\det l_0$. Then
$\mathrm{dist}(H, O) \ge h = 1/\det l_0$. Since
$\det l / \det l_0 \in \mathbb{N}$, we have $\det l \ge \det l_0$.
Therefore, we have $\mathrm{dist}(H, O) \ge 1/\det l$. According to
Hadamard's inequality, we have
$\det l \le \prod_{i=2}^{d} \|X_i - X_1\|$. Since
$\|X_i - X_1\| \le \mathrm{diameter}(C_m) = 2m\sqrt{d}$, we obtain
$\mathrm{dist}(H, O) \ge (2m\sqrt{d})^{-(d-1)}$. □

Let us now return to the idea of halfspace depth defined by a cone
(see (3.7)). As shown in Figure 3.4, a distance h defines a cone. Point p
corresponds to the origin O in the theorem above. The value of
$\sin\theta$ will be $h / \mathrm{radius}(C)$.

When the dimension is high, such as 20, the value of $\varepsilon$ based on
this lattice idea would be too small to be useful in practice. In our
testing, we just set $\varepsilon$ to a very small value. If $\varepsilon$
is not small enough, it will have the same effect as M not being big enough.
Unfortunately, we did not find a way to test whether a solution is accurate.
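
To see how quickly the lattice bound (3.11) shrinks, the following sketch simply evaluates $(2m\sqrt{d})^{-(d-1)}$ for given m and d. The function name `lattice_distance_bound` is an assumption for this example; only the formula comes from the theorem.

```python
import math


def lattice_distance_bound(m, d):
    """Lower bound (2 * m * sqrt(d)) ** (-(d - 1)) on dist(H, O) from (3.11)."""
    return (2 * m * math.sqrt(d)) ** (-(d - 1))
```

Evaluating `lattice_distance_bound(1000, 20)`, that is, integer coordinates of magnitude up to 1000 in dimension 20, gives a value far below the tolerances that floating-point LP solvers typically work with, which is consistent with the remark above that the bound is not useful in practice for high dimensions.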

Figure 3.4: Lattice

In this chapter we introduced Parker's MIP formulations and formulated
an MIP with the big-M method for the halfspace depth problem. In the next
chapter we will introduce a heuristic algorithm for the MAX FS problem,
which can be used for the halfspace depth problem.
Chapter 4



A Heuristic Algorithm 



4.1 Elastic Programming 



Chinneck [12, 14] suggests a heuristic algorithm for the MIN IIS COVER 
problem. As discussed in Chapter 2, this is also an algorithm for the half- 
space depth problem. This algorithm is based on several observations of 
elastic programming (a method to solve an integer program [10] according to 
Chinneck). In elastic programming, every constraint is elasticized by adding 
a non-negative elastic variable. Chinneck gives the following rules:

In fully elastic programming the bounds of the variables are also
elasticized in the following ways:

\[ x_j + e_j \ge l_j \]

\[ x_j - e_j \le u_j \]
The elastic objective function is to minimize the sum of the elastic variables, 
which is similar to phase 1 of the two phase simplex method [15]. After 
elasticizing, the original infeasible system becomes feasible, and the optimum 
solution will give some information about the infeasibility in the original 
system. This elastic programming is also similar to the big-M method. In 
the big-M method, a set of binary variables with a large coefficient are used 
to make the infeasible system feasible. When the optimum point of the elastic 
program is reached, the optimum value of the objective function is called the 
sum of the infeasibility (SINF). A nonzero elastic variable indicates a violated 
constraint in the original model, and the number of the nonzero variables is 
called the number of infeasibility (NINF). As Chinneck observed, the MIN 
IIS COVER problem is the problem to minimize NINF. At the optimum 
point, the value of an elastic variable is called the constraint violation of the 
corresponding constraint in the original model. The reduced cost of the slack 
or surplus variable is called the constraint sensitivity of the corresponding 
constraint, which, in fact, is the shadow price of the corresponding constraint. 
The shadow price of a constraint indicates how much the objective value of
the optimum solution will be changed by changing the right hand side of the
constraint by one unit. For more details about elastic programming, please
refer to [13].
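
The quantities SINF and NINF can be illustrated without an LP solver by evaluating them at a fixed point x. The sketch below is a simplification and not Chinneck's procedure: an elastic program would also optimize over x, whereas here x is given. The elasticizing convention used (an elastic variable added to a ">=" row, subtracted from a "<=" row, and one of each sign for an "=" row) is the usual one and is stated here as an assumption; the function name `elastic_violations` is hypothetical.

```python
def elastic_violations(constraints, x, tol=1e-9):
    """Smallest elastic variable values that make each constraint hold at x.

    constraints is a list of (coeffs, sense, rhs) with sense in
    {">=", "<=", "="}. Returns (violations, SINF, NINF) at the point x.
    """
    violations = []
    for coeffs, sense, rhs in constraints:
        lhs = sum(a * xi for a, xi in zip(coeffs, x))
        if sense == ">=":
            e = max(0.0, rhs - lhs)      # lhs + e >= rhs
        elif sense == "<=":
            e = max(0.0, lhs - rhs)      # lhs - e <= rhs
        elif sense == "=":
            e = abs(lhs - rhs)           # lhs + e' - e'' = rhs
        else:
            raise ValueError("unknown sense: %r" % (sense,))
        violations.append(e)
    sinf = sum(violations)                        # sum of infeasibilities
    ninf = sum(1 for e in violations if e > tol)  # number of infeasibilities
    return violations, sinf, ninf
```

For the infeasible system x >= 3, x <= 1, x + y = 2, the point (2, 1) violates all three rows by one unit each, while the point (1, 1) concentrates the whole violation in the first row, which is the kind of solution that helps identify a small hitting set.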

4.2 The Heuristic Algorithm 

The heuristic algorithm developed by Chinneck in [12] is based on the fol- 
lowing four observations of elastic programming. 

Observation 1 When the elastic program terminates, the constraints asso- 
ciated with non-zero elastic variables form an IIS hitting set. 

Observation 2 When the elastic program terminates, if NINF is 1, the 
constraint with a non-zero elastic variable forms the MIN IIS COVER. 

Observation 3 The SINF will be reduced more by eliminating a constraint 
in the MIN IIS COVER. 

Observation 4 Removing a constraint to which the objective function is
not sensitive will not reduce SINF.

Detailed explanations of these observations are available in [12, 13]. Based 
on these observations, a heuristic algorithm is given in [13] as follows: 

Step 1 Solve an elastic program of the original infeasible system. If the
NINF is 1, the hitting set is found due to Observation 2. If the NINF is
greater than 1, select the set of constraints with non-zero constraint
violation as candidate constraints.

Step 2 For each of these candidate constraints, delete it temporarily,
re-solve the elastic program, and record the corresponding SINF and NINF
for this constraint.

Step 3 The constraint with the minimum SINF is a member of the output 
IIS COVER. Delete this constraint permanently. If the corresponding 
NINF of this constraint is 1, the violated constraint is also a member of 
the output IIS COVER, and the algorithm terminates. If the NINF is 
greater than 1, select candidate constraints with the criteria in Step 1, 
and go to Step 2. 

This heuristic may be slow, especially when the problem size is big,
because in each step we need to solve a linear program for each candidate
constraint. Chinneck revised this heuristic in [14] to speed up the
algorithm. The revision is based on the following two observations:

Observation 5 For a constraint with non-zero constraint violation in the
original model, the relative size of the drop in SINF can be estimated by
(constraint violation) × |(constraint sensitivity)|.

Observation 6 For a constraint with zero constraint violation, the relative
size of the drop can be estimated by |(constraint sensitivity)|.

Based on these two observations, Chinneck gives new criteria for selecting
candidate constraints for Step 2. In the new criteria, the constraints with
non-zero constraint violation are sorted according to the value of
(constraint violation) × |(constraint sensitivity)| in decreasing order, and
the first k constraints are selected as candidate constraints; the
constraints with zero constraint violation are sorted according to the value
of |(constraint sensitivity)|, and the first k constraints are also used as
candidate constraints. In practice k can be set to 1. With the new criteria,
fewer candidate constraints will be considered, so the algorithm will be
faster although it could be less accurate (neither version has accuracy
guarantees).
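
The revised selection rule can be written down directly once the constraint violations and sensitivities from an elastic LP solution are available. The sketch below assumes they are given as two parallel lists; the function name `select_candidates` is hypothetical. Skipping satisfied constraints with zero sensitivity is a choice made here on the basis of Observation 4, and both groups are sorted in decreasing order of their score.

```python
def select_candidates(violation, sensitivity, k=1, tol=1e-9):
    """Pick candidate constraints following Observations 5 and 6.

    violation[i] and sensitivity[i] describe constraint i at the elastic
    LP optimum. Returns up to k violated and up to k satisfied constraints.
    """
    violated = [i for i, v in enumerate(violation) if v > tol]
    satisfied = [
        i for i, v in enumerate(violation)
        if v <= tol and abs(sensitivity[i]) > tol   # Observation 4
    ]
    # Observation 5: score = violation * |sensitivity|
    violated.sort(key=lambda i: violation[i] * abs(sensitivity[i]),
                  reverse=True)
    # Observation 6: score = |sensitivity|
    satisfied.sort(key=lambda i: abs(sensitivity[i]), reverse=True)
    return violated[:k] + satisfied[:k]
```

With violations `[0.0, 2.0, 0.5, 0.0, 0.0]` and sensitivities `[-1.0, 0.5, -3.0, 0.0, 2.0]`, the default k = 1 keeps one violated constraint (index 2, score 1.5) and one satisfied constraint (index 4, score 2.0), so only two linear programs need to be solved in Step 2.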

In this chapter we introduced elastic programming and Chinneck's heuris- 
tic algorithm for the MAX FS problem. In the next chapter we will introduce 
the branch and cut method for solving general mixed integer programs. 






Chapter 5 



The Branch and Cut Paradigm 

5.1 The Branch and Bound Method 
5.1.1 Introduction 

The branch and bound method is an approach for solving discrete and combi- 
natorial optimization problems. Many of these problems can be modeled as 
integer linear programming problems. An integer linear programming prob- 
lem is defined by a linear objective function and a set of constraints (linear 
equalities or inequalities). In addition, some or all variables are restricted 
to integer values. Correspondingly, the problems are called pure or mixed 
integer linear programming problems. Any solution that satisfies all these 
constraints is called a feasible solution. The one that maximizes or minimizes 
the objective function is called the optimum solution. To find the optimum 
solution, all the feasible solutions need to be enumerated and compared,
because there are no better ways known for checking whether a given feasible
solution is optimum. Unfortunately, the time complexity of the enumerating 
algorithm will grow exponentially as the problem size increases. Therefore, 
it is not practical to enumerate all the feasible solutions when the problem is 
large. The branch and bound method was developed to reduce the number of 
feasible solutions to test. To simplify this discussion, we explain the branch 
and bound method on pure integer linear programming problems. It will be 
obvious at the end that the branch and bound method will work on mixed 
integer linear programming problems too. 

The branch and bound method was developed independently by A.H. 
Land and A.G. Doig in 1960 and by K.G. Murty, C. Karel, and J.D.C. 
Little in 1962 [37]. The branch and bound method is a divide and conquer 
paradigm. If the original problem is too hard to solve directly, we divide the 
problem into smaller size subproblems. If any subproblem is still too hard 
to solve, we will further divide the problem until we can solve them. The 
branch and bound method manages a problem tree. The original problem 
is the root of this tree, and the children of a node are the subproblems of 
the problem associated with the node. This problem tree is called the search 
tree. 

5.1.2 The Branch and Bound Method 

The basic idea of the branch and bound method involves:

Branching Choosing and breaking a problem into some small subproblems.

Bounding Computing the lower (or upper) bounds of the subproblems. 

Pruning Eliminating those subproblems which are not needed for further 
consideration due to the bounds. 
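
The three ingredients can be seen together in a small sketch that solves a minimum hitting set instance, the combinatorial core of the MIN IIS COVER problem from Chapter 2. This is a plain branch and bound with a very weak bound, written for illustration only; it is not the branch and cut algorithm of this thesis. It assumes every set is a non-empty subset of {0, ..., n-1}, so that taking all elements is a valid initial upper bound.

```python
def branch_and_bound_hitting_set(n, sets):
    """Minimum hitting set by branch and bound on binary variables s_0..s_{n-1}."""
    best_cover = list(range(n))  # trivial incumbent: select everything

    def recurse(i, chosen):
        nonlocal best_cover
        unhit = [s for s in sets if not (s & chosen)]
        # Bounding: at least one more element is needed if a set is unhit.
        lower_bound = len(chosen) + (1 if unhit else 0)
        # Pruning: this subtree cannot beat the incumbent.
        if lower_bound >= len(best_cover):
            return
        if not unhit:
            best_cover = sorted(chosen)  # new, strictly better incumbent
            return
        if i == n:
            return  # no variables left, subproblem is infeasible
        # Branching: fix s_i = 1 in one child and s_i = 0 in the other.
        recurse(i + 1, chosen | {i})
        recurse(i + 1, chosen)

    recurse(0, frozenset())
    return best_cover
```

For the sets `[{0, 1, 2}, {2, 3}, {3, 4}, {0, 4}]` over five elements the function returns a hitting set of size two. A stronger lower bound, such as the LP relaxation discussed later in this chapter, would prune far more of the search tree.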

The branch and bound method gives us a way of enumerating the feasible 
solutions implicitly i.e., by partial enumeration. At any point of the opti- 
mization process, the status of the algorithm is defined by the current best 
feasible solution and the unexplored space of the feasible solutions. We as- 
sume the original problem is a minimization problem (a maximization prob- 
lem can be easily transformed into a minimization problem). The following 
gives a detailed description about the three basic steps. 

Branching 

In the branching step, some additional constraints are added into the original 
set of constraints. With the new constraints, we get a set of new subprob- 
lems which are called candidate problems for further consideration. Every 
candidate problem has a (possibly empty) set of feasible solutions. In other 
words, the set of original feasible solutions is divided into disjoint subsets 
(whose union is the original set of feasible solutions) . 

Suppose the feasible region is defined by the polyhedron in Figure 5.1.
After adding $x \le 3$ to the original problem, we get the subproblem on the left
hand side in Figure 5.2. After adding $x \ge 4$, we get the subproblem on the
right hand side in Figure 5.2. Notice that any integer solution must satisfy
one of these constraints. The variable $x$ is called the branching variable.

Figure 5.1: Before branching

Figure 5.2: After branching

Bounding 

In the bounding step, for a given subproblem, both the lower and upper 
bounds of the optimum objective value (the result of the objective function) 
are computed and used. The upper bound bounds the optimum objective 
value from above, which means that the optimum objective value will not
be greater than the upper bound. The upper bound is a global bound, and 
it bounds every subproblem in the search tree. Therefore, only one upper 
bound is kept in the whole optimization process. When a smaller upper 
bound is found, the upper bound for the algorithm will be updated to this 
smaller value. The solution for the current upper bound is called the current 
incumbent, which is the current best feasible solution of the problem. We
can simply set the upper bound to positive infinity at the beginning of the
algorithm and, whenever the optimum objective value of a candidate problem is
found, update the upper bound and the incumbent if it improves on them.
Another strategy is to find a feasible solution with a heuristic algorithm at
the beginning and set the upper bound to the objective value of this solution.

On the other hand, the lower bound bounds the optimum objective value
from below, which means that the optimum objective value will not be
smaller than the lower bound. The lower bound is a local bound: every
candidate problem has its own lower bound, which is no larger than the
objective value of any of its feasible solutions.
When computing the lower bound, we hope that the lower bound is
as close to the optimum objective value as possible, and that we spend as
little effort as possible [37]. One strategy for computing the lower bound is
solving a relaxed problem. We can simply remove or relax some constraints 
of the original hard problem, and then we get a relaxed problem which can 
be solved with an efficient algorithm. Because the relaxed problem has fewer 
or looser constraints than the original problem, the set of feasible solutions
for the original problem is a subset of the set of feasible solutions of the 
relaxed problem. Thus the minimum objective value of the relaxed problem 
must be smaller than or equal to that of the original problem. Therefore, 
the minimum objective value of the relaxed problem is usually used as the 
lower bound of the original problem. Linear programming (LP) relaxation is 
the most widely used relaxation. In the LP relaxation, the constraints that 
restrict variables to integer values are removed. 
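
The bounding argument above can be written compactly. As a sketch, assume (in notation not used elsewhere in this chapter) that the integer program minimizes $c^T x$ over the integer points of a polyhedron $P$, and that both minima below exist. Then

\[
z_{LP} = \min\{\, c^T x : x \in P \,\} \;\le\; \min\{\, c^T x : x \in P \cap \mathbb{Z}^n \,\} = z_{IP},
\]

because the minimum on the left is taken over a superset of the feasible set on the right. If the LP optimum happens to be integral, the two values coincide.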

Pruning 

In the pruning step, we can prune off a subproblem from the search tree 
under the following three cases. 

Case 1 If the solution of the relaxation satisfies all the constraints of the can- 
didate problem, then this solution is a feasible solution of the candidate 
problem; thus, this solution is the optimum solution of the candidate 
problem. If this happens, we say that this candidate problem is fath- 
omed. The lower bound and its solution will then be used to update 
the upper bound and the incumbent, and this candidate problem will 
be pruned off. This is the base case of the divide and conquer strategy. 

Case 2 If the lower bound is bigger than the current upper bound, that 
candidate problem will be pruned off, because any objective value of 
this candidate problem will be bigger than the upper bound. 

Case 3 If a candidate problem has no feasible solution, it will also be pruned
off, since any further restrictions (via branching) will also be infeasible. 

Case 3 is trivial; Case 1 and Case 2 are illustrated in Figure 5.3 and Figure 5.4.
In Figure 5.3, the arrow is the direction of the optimization. The subproblem
at the upper right corner will be fathomed after solving an LP relaxation. In
Figure 5.4, suppose the current upper bound is defined by a feasible solution
of the subproblem at the lower left corner (as labeled). After solving an LP
relaxation for the subproblem at the lower right corner, we will get a
non-integral solution. The objective value of the solution will be greater than
the current upper bound.

Figure 5.3: An example of fathom

Figure 5.4: An example of high lower bound

If a subproblem can not be pruned off, we need to branch on that sub- 
problem. 

A branch and bound algorithm will keep iterating these three steps, and 
terminate when no candidate problems are available. Figure 5.5 is a typ- 
ical search tree of a branch and bound algorithm. After all the candidate 
problems are considered, the last incumbent is the optimum solution of the 
original problem. In this method, not all the feasible solutions are enumer- 
ated, but the complete space of the feasible solutions is searched and the 
exact optimum solution is found. Thus, the branch and bound method is 
not a heuristic method. 

5.1.3 Strategies in the Branch and Bound Method 

The framework of the branch and bound method is very flexible. It is just 
a general method and does not specify the details in any of the three steps. 
Thus different techniques can be applied to each step. 

Branching techniques 

A branching method is given in the above section. If $x$ is a binary variable,
like $s_i$ in (3.7), which can only be assigned the value 0 or 1, we can have
two candidate problems: one with the additional constraint $x = 0$, the
other with $x = 1$. In both of these two techniques, a problem is divided
into two subproblems (so-called dichotomic branching). The search tree will
be a binary tree, like Figure 5.5. A problem can also be divided into more
subproblems (so-called polytomic branching); the resulting search tree will
be a multiway tree. For more details of the branching techniques, please refer
to [16, 28, 37].

Figure 5.5: An example of search tree

Branching variable selection

In the branching step, it is important to select the branching variable
carefully. Several methods are introduced in [28]. Murty [37] suggests that if
several candidate variables are available, we usually choose the one that will
produce the highest lower bound. The reason is that this strategy can reduce the
gap between the upper and lower bounds, and thus can increase the chance
of finishing the algorithm earlier.

Bounding techniques 

In the bounding step, we usually solve a relaxation for the lower bound. 
There are also other types of relaxation, for instance, Lagrangian relaxation 
(see [29] for details). A good upper bound at the beginning of the algorithm 
can help prune more subproblems. Different heuristic algorithms are avail- 
able. It is also important to find a good heuristic algorithm for a specific 
problem. 

Candidate problem selection 

One strategy for choosing the candidate problem is to choose the one which 
has the least lower bound because this candidate problem has the greatest 
chance that its optimum objective value is smaller than any lower bound of 
the other candidate problems. This strategy is called the best first search 
strategy. Breadth first search strategy and depth first search strategy are 
also introduced in [16] . 
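
The best first rule can be sketched with a priority queue keyed by the lower bound. This is only an illustration: the class name `BestFirstQueue` and its methods are hypothetical and are not taken from [16] or from any library.

```python
import heapq


class BestFirstQueue:
    """Candidate problem pool that always returns the least lower bound first."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so problems themselves are never compared

    def push(self, lower_bound, problem):
        heapq.heappush(self._heap, (lower_bound, self._count, problem))
        self._count += 1

    def pop(self):
        lower_bound, _, problem = heapq.heappop(self._heap)
        return lower_bound, problem

    def __len__(self):
        return len(self._heap)


queue = BestFirstQueue()
queue.push(3, "P_a")
queue.push(1, "P_b")
queue.push(2, "P_c")
first = queue.pop()   # the candidate with the least lower bound
second = queue.pop()
```

Replacing the heap with a stack gives depth first search, and replacing it with a FIFO queue gives breadth first search.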

There are many strategies available in every step other than the above 
mentioned ones. The choices depend on the characteristics of a specific prob- 
lem. The performance of an algorithm depends on having good lower and 
upper bounds. It is better that the lower and upper bounds are close to the 
optimum objective value. The tighter the bounds are, the more we can prune
off; but computing tighter bounds usually means more computational effort. 

5.1.4 An Algorithm Prototype 

The branch and bound method is only an algorithm skeleton, and has to be 
filled out for each specific problem. An algorithm prototype can be stated as 
follows. 

(1) Initialization. Solve a relaxation of the original problem to compute 
the lower bound of the optimum objective value. If the solution of the 
lower bound satisfies all the constraints of the original problem, the 
optimum solution of the original problem is found, and the algorithm 
terminates. If there is no feasible solution for the relaxed problem, 
there is no feasible solution for the original problem. If neither of these 
cases happens, find a feasible solution for the original problem with 
a heuristic algorithm and set the upper bound and the incumbent, or 
just set the upper bound to positive infinity. Finally, initialize an empty 
tree, and let the original problem be the root. 

(2) Problem Selection. If the tree is empty, the algorithm terminates. If 
there is an incumbent, it is the optimum solution of the original prob- 
lem. If not, the original problem is infeasible. If the tree is not empty, 
select and remove a candidate problem from the tree with a selection 
rule. 

(3) Branching. Divide the selected candidate problem into a set of new
candidate problems with a branching rule. The new candidate problems
are the children of the selected candidate problem in the search tree.

(4) Bounding and Pruning. For each new candidate problem generated in 
step (3), compute the lower bound. If the candidate problem satisfies 
any pruning criteria in Section 5.1.2, discard it. If this candidate prob- 
lem is fathomed, then update the upper bound and the incumbent with 
the value and the solution of this lower bound. After the upper bound 
is updated, discard any candidate problem with a lower bound which is 
greater than the current upper bound. If this candidate problem is not 
fathomed, put it into the tree. After processing all the new candidate 
problems, go to step (2). 

The order of these steps can vary, and the strategies in every step can 
be different from the ones in this prototype. Two typical branch and bound 
algorithms, eager and lazy branch and bound, are described in [16]. 
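
The prototype can be sketched on a small minimum hitting set problem, which has the same flavour as the problem studied in this thesis. The sketch rests on several assumptions that are not part of the prototype: the function names are hypothetical, a greedy count of pairwise disjoint unhit sets stands in for the LP relaxation as the lower bound, the bound is computed when a problem is removed from the stack rather than when it is created, and a candidate is pruned when its bound is greater than or equal to the upper bound (valid here because the objective is integral).

```python
from math import inf


def lower_bound(unhit, n_chosen):
    """Pairwise disjoint unhit sets each need one more element (stand-in for an LP bound)."""
    used, extra = set(), 0
    for s in sorted(unhit, key=len):
        if not (s & used):
            used |= s
            extra += 1
    return n_chosen + extra


def min_hitting_set(sets):
    """Depth first branch and bound; returns (size, hitting set)."""
    sets = [frozenset(s) for s in sets]
    best_size, best = inf, None               # upper bound and incumbent
    stack = [(frozenset(), frozenset())]      # (elements fixed to 1, elements fixed to 0)
    while stack:                              # problem selection
        chosen, excluded = stack.pop()
        unhit = [s for s in sets if not (s & chosen)]
        if any(s <= excluded for s in unhit):
            continue                          # Case 3: no feasible solution
        if lower_bound(unhit, len(chosen)) >= best_size:
            continue                          # Case 2: cannot improve on the incumbent
        if not unhit:
            best_size, best = len(chosen), set(chosen)
            continue                          # Case 1: fathomed, update the incumbent
        # Branching: pick the free element that occurs in the most unhit sets.
        counts = {}
        for s in unhit:
            for e in s - excluded:
                counts[e] = counts.get(e, 0) + 1
        e = max(sorted(counts), key=counts.get)
        stack.append((chosen, excluded | {e}))    # child with e fixed to 0
        stack.append((chosen | {e}, excluded))    # child with e fixed to 1, explored first
    return best_size, best


family = [{1, 2}, {2, 3}, {3, 4}, {4, 5}]
size, cover = min_hitting_set(family)
```

Exploring the child with the element fixed to 1 first mirrors a diving strategy: it reaches a feasible hitting set quickly, which gives an upper bound for pruning the remaining candidates.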

5.1.5 More about Branch and Bound 

To apply the branch and bound method, one needs to develop a specific 
algorithm for a specific problem. An algorithm can have good performance 
on one problem, but it can have poor performance on another. For a large 
scale discrete and combinatorial optimization problem, finding a good feasible 
solution for the upper bound at the beginning of the algorithm is a "key 
issue" [16]. The branch and bound method does not reduce the theoretical
time complexity of the original problem. In the worst case, the search tree 
contains every feasible solution as a leaf. For a large scale problem, the 
computation load is usually too heavy for a single processor computer. Thus, 
parallel computers are usually employed for large scale problems, and, for this 
purpose, many parallel branch and bound algorithms have been developed. 

5.2 The Cutting Plane Method 

The cutting plane method is an approach to improve the non-integral solution 
of the LP relaxation of an integer linear programming problem. After solving 
an LP relaxation, the optimal solution may not be integral. The cutting 
planes (or cuts) are the constraints which are satisfied by the original integer 
programming problem, but violated by the non-integral solution. We can find 
some such cutting planes and add them to the original problem to reduce 
the feasible region of the LP relaxation; ideally, the solution of the new 
LP relaxation will be closer to the integral optimal solution (as shown in 
Figure 5.6). The cutting plane algorithms will keep repeating the process 
of adding cuts and solving the linear relaxation until the integral optimal 
solution is found. As shown in Figure 5.7, if some cutting planes intersect on 
the integral optimal solution of the integer program and define an optimal 
vertex, the integer program will be solved. 

Figure 5.6: An example of a cutting plane

Figure 5.7: An example of cutting planes intersecting on the optimal solution

In fact, it is hard to find the optimum integral solution by the cutting
plane method itself, although it has solved some problems successfully.
Several types of cutting planes for general integer programs have been proposed,
for example, Chvátal-Gomory cuts, knapsack cuts, and lift-and-project cuts.
For more details about the cutting plane method, please refer to [34, 38].
These general cutting planes may not work well for some problems.
Problem-specific cuts can be developed for a specific problem instead. Recall that
the halfspace depth problem is a hitting set problem. The optimal solution 
of (3.7) is a hitting set of all IISs in (3.4). For an IIS C, we can formulate 
an IIS inequality (3.2). We can use such constraints as cuts for (3.7); this 
reduces the problem of finding cuts to that of finding IISs. As mentioned in 
Section 2.2, an IIS corresponds to a vertex of the infeasible system's alterna- 
tive polyhedron. [27] shows that generating all IISs of an infeasible system 
is NP-hard. 

5.3 The Branch and Cut Method 

In a branch and bound algorithm, we can apply the cutting plane method 
to every node in the search tree. Then, the branch and bound approach can 
be sped up dramatically. The combination of the branch and bound and the 
cutting plane method is called the branch and cut method. For more details 
about branch and cut, please refer to [23, 33, 35, 26]. 

5.3.1 Parallel Branch and Cut 

As mentioned above, a branch and bound algorithm may not be fast enough 
for a large problem. The same is unfortunately true for branch and cut algo- 
rithms. Therefore, parallel branch and cut algorithms have been developed. 
A natural idea for parallelizing the branch and cut method is to assign each
subproblem to a processor. The drawback of this idea is that the workload for
different processors could be significantly different, because some subproblems
could be solved with little effort and some could be hard to solve, so we
would not have good efficiency. A better idea is the master-slave paradigm.
The master processor maintains the search tree, and delivers subproblems 
to the slave processors when possible. This idea has better efficiency than 
the first one, although it has more inter-processor communications and the 
master processor may be overly busy. For more information about parallel 
branch and cut methods, please refer to [17, 43, 45, 46]. 

In this chapter we introduced the branch and bound and cutting plane
methods for general mixed integer programs. In the next chapter we will
introduce our branch and cut algorithm for the data depth problem.

Chapter 6

The Branch and Cut Algorithm

6.1 The Algorithm 

We develop a branch and cut algorithm for the halfspace depth problem. In 
this algorithm, we first use Chinneck's heuristic algorithm to find a feasible 
solution and set up the upper bound with this solution. We then initialize the 
search tree. The main part is iteratively selecting and processing a problem 
from the tree. After all the problems in the tree are processed, the optimum 
solution of the problem is found. In every iteration, we repeatedly solve an
LP relaxation and add hitting set cuts. If the subproblem can be solved, it
will be pruned off; otherwise, it will be divided into two subproblems.

The top level algorithm is shown in Algorithm 1, and Algorithm 2 and 
Algorithm 3 are subroutines of this algorithm. 

Algorithm 1 HalfSpaceDepth(S, p)
Input: A set S of points and a point p in $\mathbb{R}^d$.
Output: The halfspace depth of p.
1: Generate an infeasible system $S_1$ and the corresponding integer program $S_2$ with the input.
2: Find a MIN IIS COVER c of $S_1$ with Chinneck's heuristic algorithm.
3: if c == 0 or c == 1 then
4:   return c
5: end if
6: upperbound ← c
7: Initialize the search tree with $S_2$ as the root.
8: while the tree is not empty do
9:   Remove a problem P from the search tree. /* with depth first search strategy */
10:   Call BoundandCut(P).
11:   if P is infeasible or the objective value > upperbound then
12:     continue
13:   else if the subproblem is fathomed then
14:     upperbound ← the objective value of P
15:     continue
16:   else /* the subproblem is not fathomed */
17:     Call Branch(P) and add the new subproblems into the search tree.
18:   end if
19: end while
20: return upperbound

Algorithm 2 BoundandCut(P)
Input: An integer program P.
Output: A solution of P. /* may not be integral */
1: Solve a linear program relaxation of P.
2: if P is infeasible or the objective value > upperbound then
3:   Report the result and return
4: else if the solution is not integral then
5:   repeat
6:     Generate some hitting set cuts (the details are explained in Section 6.2), add them into P, and re-solve P.
7:   until the solution is not sufficiently improved or the solution is integral or no cuts can be generated
8: end if
9: return the solution






Algorithm 3 Branch(P)
Input: An integer program P.
Output: Subproblems of P.
1: Identify the set of constraints in $S_1$ that correspond to the constraints in P.
2: Solve an elastic program of $S_1$. Find the constraint that has the best chance to be in the MIN IIS COVER of $S_1$ using the observations in Section 4.2 (the details of branching variable selection are explained in Section 6.2).
3: Let $s_b$ be the binary variable that corresponds to that constraint. Divide P into two new subproblems by fixing $s_b$.
4: return the two new subproblems.

6.2 Special Techniques in This Algorithm 

Initial Heuristic Algorithm 

Chinneck's heuristic algorithm is very fast and accurate. Most of the time 
this heuristic finds an optimum solution. Hence, we will have a very good 
upper bound at the beginning of the branch and cut algorithm. The heuristic 
in [14] is used, because it is faster according to Chinneck. 

IIS Hitting Set Cuts 

We apply IIS hitting set cuts for the problem. They are problem-specific 
cutting planes. 






Pseudo-Knapsack Technique for Generating Cuts 

In order to generate cuts that are violated by the current solution of the
LP relaxation, we use a pseudo-knapsack technique to find as many binary
variables as possible with a summation smaller than 1. (Note that the binary
variables become continuous variables in the LP relaxation, with
bounds $0 \le x_i \le 1$ for any variable $x_i$.) After solving an LP relaxation,
the binary variables are ranked according to their values in increasing order.
Select the first k variables (k is maximal) such that the summation of them
is smaller than 1. Find the IISs in the corresponding constraints of these
variables. Such an IIS must give a violated cutting plane for the current
solution of the LP relaxation, because the IIS inequality requires the
variables of an IIS to sum to at least 1.

In fact, identifying the maximum set of binary variables is not a true
knapsack problem, because in this problem the cost and the value of an item
(a binary variable) are the same. The greedy method in the above paragraph
will give the optimal solution of this pseudo-knapsack problem. We can prove
this by contradiction. Suppose $\{a_1, a_2, \ldots, a_n\}$ is the set of the values of the
binary variables in increasing order, the greedy method identifies the first k
items, and a better algorithm identifies a set J of j items (j > k). The sum
of any k + 1 items in J is greater than or equal to $\sum_{i=1}^{k+1} a_i$, because if J contains
any items that are different from the items in $\{a_1, a_2, \ldots, a_{k+1}\}$, any of those
different items would be greater than or equal to $a_{k+1}$. Hence, the sum of the
items in J would be at least 1, noting that $\sum_{i=1}^{k+1} a_i \ge 1$ by the maximality
of k. Therefore, a better algorithm cannot exist.

This technique is used in one of the two hitting set cut generators we
implemented.
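
The greedy selection can be sketched as follows. The function names `pseudo_knapsack` and `is_violated` are hypothetical, the LP values are assumed to be given as a plain list, and a real implementation would probably compare against 1 with a numerical tolerance.

```python
def pseudo_knapsack(values, limit=1.0):
    """Indices of the largest set of variables whose LP values sum to less than limit."""
    order = sorted(range(len(values)), key=lambda j: values[j])  # increasing LP value
    picked, total = [], 0.0
    for j in order:
        if total + values[j] >= limit:
            break                      # taking any later (larger) value would also fail
        picked.append(j)
        total += values[j]
    return picked


def is_violated(iis, values):
    """An IIS cut (the variables of the IIS sum to at least 1) is violated if the sum is below 1."""
    return sum(values[j] for j in iis) < 1.0


lp_values = [0.5, 0.0, 0.3, 0.1, 1.0]
picked = pseudo_knapsack(lp_values)
```

Any IIS found among the constraints indexed by `picked` uses a subset of these variables, so its variables sum to less than 1 and `is_violated` holds for it.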

Branching Variable Selecting Rule 

When selecting the branching variable, we mimic the technique in Chinneck's
heuristic algorithm. After solving an elastic program of $S_1$, we estimate the
drop of SINF that each constraint can give by Observation 6 in Section 4.2.
The constraint b which can give the most significant drop has the best chance
to be a member of the MIN IIS COVER of $S_1$ according to Observation 3
in [14]. The binary variable $s_b$ that corresponds to b is selected as the
branching variable.

Candidate Problem Selection 

By fixing $s_b$, we get two new candidate problems, one with $s_b = 1$, the other
with $s_b = 0$. In the problem selection step, a depth first strategy is used
and the problem with $s_b = 1$ is selected as the new problem to process. As
mentioned in Section 3.4, fixing the binary variable $s_b$ to 1 has the effect of
removing constraint b from $S_1$. As the algorithm continues to dive in the
problem tree, $S_1$ will usually become feasible quickly due to the accuracy of
Chinneck's algorithm. At that point, the candidate problem will be fathomed
because the optimum objective value will be 0, an integral solution. This
strategy will hopefully keep the depth of the search tree small, so that we
have a good chance of obtaining a small search tree for the whole problem.

6.3 A Binary Search Idea

As we mentioned in Section 3.4, we can not check the accuracy of the solutions 
of the MIPs. However, we can find an accurate solution for a problem by 
solving several MIPs. The idea is as follows: 

6.3.1 New MIP Formulation 

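In this idea, the MIP (3.7) needs to be changed to the following form:

\begin{equation}
\begin{aligned}
\text{minimize}\quad & -\epsilon \\
\text{subject to}\quad & \sum_{j=1}^{n} s_j \le \mathit{guess} \\
& \sum_{i=1}^{d} (A^1_i - A^p_i)\, x_i + s_1 M \ge \epsilon \\
& \sum_{i=1}^{d} (A^2_i - A^p_i)\, x_i + s_2 M \ge \epsilon \\
& \qquad \vdots \\
& \sum_{i=1}^{d} (A^n_i - A^p_i)\, x_i + s_n M \ge \epsilon \\
& s_j \in \{0, 1\} \quad \forall j \in \{1, 2, \ldots, n\} \\
& \epsilon \ge 0 \\
& -\infty < x_i < +\infty \quad \forall i \in \{1, 2, \ldots, d\}
\end{aligned}
\tag{6.1}
\end{equation}

In this formulation $\epsilon$ is a variable, and there is also one more constraint in
which guess is a value we want to test the depth against. If the optimal value
of the objective function is 0, guess is smaller than the depth of point $A^p$.

6.3.2 The Binary Search Algorithm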

In this algorithm we need to modify our branch and cut algorithm,
Algorithm 1, before using it as a subroutine. The subroutines for formulating
MIPs and for Chinneck's heuristic are separated from the original Algorithm 1.
It will only solve an MIP, and it will terminate as soon as it finds a feasible
solution which gives a nonzero $\epsilon$, because a nonzero $\epsilon$ implies that guess is
no less than the depth of $A^p$. The binary search algorithm, shown in
Algorithm 4, maintains a cut pool containing the cutting planes generated in
earlier calls to Algorithm 1. These cuts will be used as indexed cuts in
later calls.

In this chapter we introduced our branch and cut algorithm for the half- 
space depth problem. Due to the problem in the MIP formulation, we can not 
guarantee an accurate solution with Algorithm 1. Therefore we developed a 
binary search idea to find the accurate solution. In the next chapter we will 
introduce the implementation details of our algorithm. 

Algorithm 4 HalfSpaceDepthWithBinarySearch(S, p)
Input: A set S of points and a point p in $\mathbb{R}^d$.
Output: The halfspace depth of p.
1: Generate an infeasible system $S_1$ with the input.
2: Find a MIN IIS COVER c of $S_1$ with Chinneck's heuristic algorithm.
3: if c == 0 or c == 1 then
4:   return c
5: end if
6: Initialize a cut pool.
7: upperbound ← c; lowerbound ← 1
8: guess ← ⌊(upperbound + lowerbound)/2⌋
9: while lowerbound < upperbound do
10:   Formulate an MIP $S_2$ with guess
11:   Call HalfSpaceDepth($S_2$) and add the newly generated cuts into the cut pool.
12:   if $\epsilon = 0$ or the MIP is infeasible then
13:     lowerbound ← guess + 1
14:     guess ← ⌊(upperbound + lowerbound)/2⌋
15:   else
16:     upperbound ← guess
17:     guess ← ⌊(upperbound + lowerbound)/2⌋
18:   end if
19: end while
20: return upperbound
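
The search loop of Algorithm 4 can be sketched on its own by hiding the MIP behind a callback. Both names below are hypothetical: `has_positive_epsilon(guess)` stands for solving (6.1) with the given guess and reporting whether a solution with nonzero $\epsilon$ was found, and `heuristic_cover_size` stands for the value c returned by Chinneck's heuristic. The cut pool is omitted.

```python
def binary_search_depth(heuristic_cover_size, has_positive_epsilon):
    """Binary search between 1 and the heuristic upper bound, as in Algorithm 4."""
    lower, upper = 1, heuristic_cover_size
    while lower < upper:
        guess = (upper + lower) // 2        # floor, so that guess < upper and the loop shrinks
        if has_positive_epsilon(guess):
            upper = guess                   # guess is no less than the depth
        else:
            lower = guess + 1               # guess is smaller than the depth
    return upper


# Toy oracle: pretend the true depth is 5 and record which guesses were tested.
tested = []

def toy_oracle(guess):
    tested.append(guess)
    return guess >= 5

depth = binary_search_depth(9, toy_oracle)
```

Rounding down matters: with rounding up, the case lowerbound = 1 and upperbound = 2 would set guess to 2 and could repeat forever.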

Chapter 7

Implementation

Our branch and cut algorithm is implemented with the BCP library from 
the COIN-OR project [1], along with the Osi, Clp and Cgl libraries from
this project. In this chapter we will introduce these libraries and how our 
branch and cut algorithm is implemented with them. For the binary search 
algorithm, we just make some adjustments of the branch and cut algorithm, 
and use it as a subroutine. 

7.1 BCP 

BCP is an open source Branch-Cut-Price framework. Pricing is another
technique for solving integer programs [7, 50]. Our algorithm is a branch
and cut algorithm, thus we only use the branch and cut part of BCP. BCP
is a set of C++ classes and functions which manage the search tree. It
does not contain any LP solver or cutting plane generator. The Osi (Open
Solver Interface) library is used as the interface between BCP and an LP 
solver. Clp (COIN-OR linear programming) is used as the LP solver in our 
implementation. Some commercial LP solvers, like CPLEX or Xpress, might 
be faster than Clp, but we want other researchers to have easy access to our
code. Of course, it is possible to change the code in order to use other LP
solvers thanks to Osi. Cgl (Cut Generation Library) is a collection of cut 
generators, which is used to generate cutting planes for BCP. 

BCP only handles minimization problems, since a maximization problem can be
easily transformed into a minimization problem. BCP is designed for parallel
execution in the master-slave paradigm; it also supports sequential execution.
One philosophy of BCP is black box design: the users do not need to know 
the implementation details. To use BCP, in principle we only need to know 
its interfaces and the parameters. If one wants to use some techniques which 
BCP does not provide, one can "open the box" and edit the source code and 
recompile the library; this could be an obstacle because BCP currently lacks 
good documentation. 

7.1.1 Structure of BCP 

BCP has four independent computational modules: Tree Manager, Linear 
Programming (LP), Cut Generator, and Variable Generator. 

The Tree Manager Module This module is the master process. It
initializes the problem and manages the search tree. It will keep track of
all processes and distribute subproblems to the slave processes.

The Linear Programming (LP) Module This module is the most
important part of BCP. It is a slave module, and the tree manager module
can manage several LP modules during the parallel execution. The LP
module not only solves LP relaxations, but also applies the
cutting plane method, selects branching variables, and branches the
subproblems. It performs all the branch, bound, and cut jobs.

The Cut Generator Module This module will generate cutting planes for 
the LP module based on an LP solution. 

The Variable Generator Module This module will generate variables for 
the LP module during the pricing process. 

Only the first two modules are used in our implementation. The cut 
generator module is necessary when the work load or the required memory for 
generating cutting planes is big. In our algorithm the cuts can be generated 
in a short time. Hence, it is better to generate the cuts in the LP module. 
As we mentioned before, different strategies can be applied to one operation 
in the branch and bound method. We need to specify a set of parameters for 
BCP to use some specific techniques that have been implemented in BCP. 

7.1.2 Parallelization

The design of the independent modules makes parallelization easy. The
modules communicate by passing messages. BCP supports both the MPI and PVM
protocols. In our implementation MPI is used. The tree manager process
maintains a single list of candidate problems, and assigns a subproblem to an
LP process when that process is idle. This single list is a bottleneck for
parallelization. Because of the
limitations of memory and CPU power of a single node, the tree manager
process can only manage a limited number of LP and other processes.

For more details about BCP, please refer to the BCP manual [44], al- 
though some contents are out of date. 

7.2 Implementation Details 

The code of this algorithm is based on the example BAC [32] written by
Margot. We also implemented Chinneck's heuristic algorithm [14] with Osi
and Clp, and two cut generators to generate the hitting set cuts.

7.2.1 The MPS File Generator 

A C++ class template to generate an MPS (Mathematical Programming
System, a text file format for linear programs) file is implemented. This
generator will read the input data from a text file, then generate an MPS
file of the MIP and an MPS file of the infeasible system. Some ANOVA
(Analysis of Variance) applications need duplicated points in the input. In
this case, the same duplicated data are either all inside the halfspace or all 
outside the halfspace when finding the depth of a point. Therefore, the binary 
variables associated with these data will be one or zero simultaneously. When 
formulating the MIP, we keep only one of the duplicated constraints and 
assign a weight of the number of the duplicated constraints to the associated 
binary variable in the objective function. 
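
The duplicate handling can be sketched as follows. This is an illustration in Python rather than the C++ generator described above, and the function name `collapse_duplicates` is hypothetical.

```python
from collections import Counter


def collapse_duplicates(points):
    """Keep one copy of each distinct point and return its multiplicity as a weight."""
    counts = Counter(tuple(p) for p in points)   # preserves first-seen order
    unique = list(counts)
    weights = [counts[p] for p in unique]        # objective coefficient of each binary variable
    return unique, weights


points = [(1.0, 2.0), (3.0, 4.0), (1.0, 2.0), (1.0, 2.0)]
unique, weights = collapse_duplicates(points)
```

With these weights, removing the single constraint kept for a duplicated point costs as much in the objective as removing all of its copies.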

7.2.2 The Cut Generators 

The cut generator will receive the solutions of an LP relaxation from the 
LP process and generate cutting planes based on these solutions. We
implemented two hitting set cut generators with Cgl. One is based on the idea
in [9], the other is based on the idea in the appendix of [8].

Bremner, Fukuda, and Rosta developed a primal-dual algorithm for the
halfspace depth problem in [9]. Their algorithm finds the minimum
transversal of all MDSs in the input data set. They developed a library to
generate the MDSs (recall that MDSs are the same as IISs), and that library
is based on Avis' Lrslib library [5]. We use this MDS generating library to
generate IISs for our algorithm in the first cut generator.

In this cut generator, we use the pseudo-knapsack technique in Section 6.2
to find a set of binary variables. Then we identify the set of points in the
input data set that correspond to the binary variables. This set of points
is then used as the input for the MDS generating library to generate a set of
IISs. Finally we formulate a set of cutting planes in the form of (3.2), one
for each IIS found.
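The last step can be sketched in a few lines. Inequality (3.2) is not reproduced in this chapter, so the sketch assumes the usual hitting set form: for an IIS with index set I, the binary variables s_i must satisfy sum over i in I of s_i >= 1. The function names and the sample data are hypothetical.

```python
def hitting_set_cuts(iis_list):
    """Return one cut per distinct IIS, as a sorted list of variable indices.

    Each cut stands for the inequality: sum of s[i] over the indices >= 1.
    """
    cuts = []
    seen = set()
    for iis in iis_list:
        key = frozenset(iis)
        if key and key not in seen:  # skip empty sets and repeated IISs
            seen.add(key)
            cuts.append(sorted(key))
    return cuts


def is_violated(cut, s, tol=1e-9):
    """True if the LP solution s (index -> value) violates the cut."""
    return sum(s[i] for i in cut) < 1.0 - tol


# Hypothetical IISs (as constraint indices) and a fractional LP solution.
iis_list = [[0, 2, 5], [1, 2], [5, 0, 2]]
s = {0: 0.2, 1: 0.9, 2: 0.3, 5: 0.1}

cuts = hitting_set_cuts(iis_list)
violated = [cut for cut in cuts if is_violated(cut, s)]
print(cuts)
print(violated)
```

Only the violated cuts need to be added to the LP relaxation; cuts that the current solution already satisfies do not change it.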

The paper [8] gives an idea for generating a basic infeasible subsystem (BIS)
of an infeasible system. Given an infeasible system $Ax \ge b$ where
$A \in \mathbb{R}^{n \times d}$ and $b \in \mathbb{R}^{n}$, a basic infeasible subsystem is an
infeasible subsystem of cardinality no more than $d + 1$. To find a basic
infeasible subsystem, the idea is to apply phase 1 of the two-phase simplex
method [15] by solving the following LP, in which the scalar $x_0$ is added
to every row:

minimize $x_0$
subject to $Ax + x_0 \ge b$ (7.1)

After getting the optimal solution, the set of tight constraints corresponds
to a basic infeasible subsystem of $Ax \ge b$. For more details about the basic
infeasible subsystem, please refer to [8]. The basic infeasible subsystem may
not be irreducible if it contains a degenerate IIS whose cardinality is
smaller than $d + 1$; nevertheless it defines a cutting plane.
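As a small worked illustration of (7.1), consider d = 1 and the infeasible system x >= 1, -x >= 0. The numbers are chosen here for illustration and are not taken from [8].

```latex
\begin{align*}
\text{minimize}\quad   & x_0 \\
\text{subject to}\quad & x + x_0 \ge 1, \\
                       & -x + x_0 \ge 0 .
\end{align*}
% Adding the two constraints gives 2 x_0 \ge 1, hence x_0 \ge 1/2.
% The optimum x_0 = 1/2 is attained at x = 1/2, where both constraints are tight:
%   1/2 + 1/2 = 1   and   -1/2 + 1/2 = 0 .
```

Both constraints are tight at the optimum, so the basic infeasible subsystem consists of both inequalities and has cardinality 2 = d + 1.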

At every node of the search tree in our algorithm, we obtain a distinct
infeasible system by removing the inequalities that are associated with the binary
variables which have been fixed to one. Ideally, this lets us identify a different
basic infeasible subsystem at each node.

BCP does not currently support global cuts. Any cuts added to a subproblem
are only available to its children. This is unfortunate for us, since
all hitting set cuts are globally valid. On the other hand, keeping too many
cuts can slow down each node (Bremner, Fukuda, and Rosta [9] observed that
adding all IIS cuts made solving the LP relaxation very slow).



7.2.3 The Tree Manager Process 

The tree manager process is the central process. It initializes the algorithm 
and manages the search tree. After the algorithm terminates, it will report 
the final results. The tree manager process performs the following functions: 

• Read the integer problem and the infeasible system from MPS files. 

• Compute an initial upper bound for the integer program with the 
heuristic algorithm applied on the infeasible system. 

• Initialize the integer problem. 

• Initialize the search tree. 

• Send the problem to the LP process(es). 

• Receive solutions and update the best one. 

• Receive the data and cuts that subproblems will need in the future. 

• Receive requests from the LP process(es) and send a subproblem.

• Receive branching information, and branch on the processed subproblem.

• Keep track of the upper bound, and inform the LP process(es) when
it is updated.

• Print the final results. 

The work flow of the tree manager process is shown in Figure 7.1.

Figure 7.1: The Tree Manager Process



7.2.4 The Linear Programming Process 

The LP process receives a subproblem from the tree manager process and
performs the branch and cut work on the subproblem. This process performs
the following functions:

• Initialization. 

• Receive the problem from the tree manager process. 

• Set up LP solver. 

• Request new subproblem. 

• Receive a subproblem and related data. 

• Solve an LP relaxation. 

• Test feasibility and fathoming. 

• Generate cutting planes with cut generators and add the cuts into the 
subproblem. 

• When necessary, choose branching object and apply branching strategy. 

• Send results, related data, and cuts to tree manager process. 

The work flow of the linear programming process is shown in Figure 7.2. 

Figure 7.2: The Linear Programming Process

In this process, if the cut generator based on the MDS generator is used,
cutting planes can always be found based on a solution of the LP relaxation.
In each iteration we test how much the cutting planes improve the solution.
If the objective value of the solution is improved by a given amount, new cuts
are generated and another iteration is done. The cut generator
based on the phase 1 simplex method is easy to use: at each node one cut is
generated. Other cut generators in Cgl, like the Gomory cut generator, are
also used.
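The iteration test described above can be sketched as a generic loop. This is not BCP code: `solve_lp` and `generate_cuts` are hypothetical callbacks standing in for the LP solver and a cut generator, and the toy bound sequence is invented for the demonstration.

```python
def cut_loop(solve_lp, generate_cuts, min_improvement=0.001, max_rounds=50):
    """Add cuts while each round improves the LP objective enough.

    solve_lp(cuts) returns the objective value of the LP relaxation with the
    given cuts; generate_cuts(cuts) returns a list of new cuts (possibly empty).
    """
    cuts = []
    objective = solve_lp(cuts)
    for _ in range(max_rounds):
        new_cuts = generate_cuts(cuts)
        if not new_cuts:
            break
        cuts.extend(new_cuts)
        new_objective = solve_lp(cuts)
        improvement = new_objective - objective
        objective = new_objective
        if improvement < min_improvement:
            break  # the last round barely helped, so stop and branch instead
    return objective, cuts


# Toy stand-ins: the bound rises with the number of cuts, then stalls.
bounds = [2.0, 2.5, 2.8, 2.8004, 2.8005]


def toy_solve_lp(cuts):
    return bounds[min(len(cuts), len(bounds) - 1)]


def toy_generate_cuts(cuts):
    return ["cut%d" % len(cuts)]  # one new (dummy) cut per round


objective, cuts = cut_loop(toy_solve_lp, toy_generate_cuts)
print(objective, len(cuts))
```

In the toy run the third round improves the bound by less than the threshold, so the loop stops after three cuts.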

The cutting planes generated at each node are sent back to the tree manager
(TM) process, and are then sent to the current node's children.

We also apply the simple rounding heuristic in [32] to the solution of the
LP relaxation. This heuristic rounds the values of the integral variables to
integers. In fact this heuristic does not work well most of the time. We only
apply it when the current search tree level is greater than 7 and the current
iteration is greater than 5.

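The gating rule for the rounding heuristic can be written down directly. The sketch below is only illustrative: the function names are hypothetical, values of exactly 0.5 are assumed to round up, and the rounded vector would still have to be tested for feasibility before it could serve as an upper bound.

```python
def should_round(level, iteration, min_level=7, min_iteration=5):
    """Apply rounding only deep in the tree and after several iterations."""
    return level > min_level and iteration > min_iteration


def round_binaries(values):
    """Round each fractional binary value to 0 or 1 (0.5 rounds up)."""
    return [1 if v >= 0.5 else 0 for v in values]


lp_solution = [0.2, 0.5, 0.9, 0.0]
if should_round(level=8, iteration=6):
    candidate = round_binaries(lp_solution)
    print(candidate)
```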
7.2.5 Parameters 

The parameters can be passed to the algorithm by a text file. We can specify 
the maximum running time of the algorithm. We can also specify the numer- 
ical precision. In this algorithm we set the granularity and integer tolerance 
to 10~^^. We can also choose the branching strategy. In this algorithm, we 
need to disable the default strong branching strategy in order to use the 
greedy branching rule. To use the candidate problem selection strategy in 
Section 6.2, we set the tree search strategy to the depth first search and the 
child preference to dive down. Many other parameters can also be specified.
The BCP documentation [2] contains the full list of parameters.

In this chapter we introduced the BCP library and the details of our
implementation. In the next chapter we present experimental results.

Chapter 8



Computational Experiments 

Our algorithms have been tested on a Myrinet/4-way cluster that consists of 
dual socket SunFire x4100 nodes which are populated with 2.6 GHz dual-core 
Opteron 285 SE processors and 4 GB RAM per core. We set the CPU time 
limit to 60 minutes in these tests. For readability, we relegate most of the 
raw experimental results to an appendix, and report only a summary in this 
chapter. 

8.1 Numerical Issues 

In practice, if the value of ε in the MIP is too small compared with the
coefficients of the constraints, the linear programming solver would round
it to zero. Our solution is to scale the data items so that their norms are
similar and relatively small. For a few data sets, the depth values reported
by our algorithm with different strategies or parameters are different (with a
difference of 1). This could be caused by bugs in our code or bugs in BCP,
but we also suspect that it is due to some numerical issue.
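One possible scaling scheme is sketched below. The exact scheme used in our implementation is not specified here, so this is an assumption: every nonzero point is rescaled to a common norm, with the query point taken to be the origin. Multiplying a point by a positive factor does not change the sign of its inner product with any direction, so the number of points in any halfspace whose boundary passes through the origin is unchanged. The function name and sample data are hypothetical.

```python
import math


def scale_to_norm(points, target_norm=1.0):
    """Rescale every nonzero point to Euclidean norm target_norm."""
    scaled = []
    for p in points:
        norm = math.sqrt(sum(c * c for c in p))
        if norm == 0.0:
            scaled.append(tuple(p))  # the origin stays where it is
        else:
            factor = target_norm / norm
            scaled.append(tuple(c * factor for c in p))
    return scaled


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


points = [(3.0, 4.0), (0.0, 0.0), (-600.0, 800.0)]
scaled = scale_to_norm(points)
direction = (1.0, -2.0)

# The side of the hyperplane through the origin is the same before and after.
sides_before = [dot(direction, p) >= 0 for p in points]
sides_after = [dot(direction, p) >= 0 for p in scaled]
print(scaled)
print(sides_before == sides_after)
```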

8.2 Results for Randomly Generated Data

The data sets tested in this section are a subset of the data sets used in [9],
and they are randomly generated. For every data set we compute the depth
of the first point, which is the origin. For all the tests in this section, the ε
of the MIP (3.7) is set to 0.00001. Compared with the results of the
primal-dual algorithm and the binary search algorithm, the depth values computed
with our branch and cut algorithm (with the BIS cut generator) are accurate.
Therefore the ε is small enough.

8.2.1 Comparing Branching Rules and Tree Search Strategies

We first test our algorithm with the first hitting set cut generator in
Section 7.2.2, the one implemented with the MDS generating library, and with
the greedy branching rule (see Section 6.2). Table A.3, Table A.4, Table A.5,
and Table A.6 give the performance on four groups of data sets. We generate 10
cuts in one iteration of the LP process. If the objective value is improved by
0.001, the LP process does another iteration. When the MDS cut generator
is used, most of the CPU time is spent on cutting plane generation. If
more cuts are generated in one iteration, the algorithm is slowed down,
but is more memory efficient.

Table A.7, Table A.8, Table A.9, and Table A.10 give the performance
when the default strong branching rule in BCP is used. In Figure 8.1 and
Figure 8.2 we compare the performance of the default strong branching rule
with that of the greedy branching rule. The figures show that strong branching
gives better performance for most problems, probably because fewer search tree
nodes are processed. For many difficult problems, however, greedy branching
works better. In those difficult cases, greedy branching spends much less time
on branching, although more search tree nodes are processed.

Table A.11, Table A.12, Table A.13, and Table A.14 give the performance
of the best first candidate problem selection rule. With this strategy, the
performance is similar to that with the depth first strategy. In these tests,
strong branching and the MDS cut generator are applied.

8.2.2 Comparing Cut Generators 

Table A.15, Table A.16, Table A.17, and Table A.18 give the performance
of our algorithm compiled with the second hitting set cut generator in
Section 7.2.2, the one implemented with the basic infeasible subsystem idea. In these
tests, the default strong branching in BCP is used. With the BIS cut generator,
less CPU time is used to generate cuts, and the algorithm has better
overall performance, although the search tree is larger. The BIS cut generator
uses floating point arithmetic, the same as the rest of the system. The
MDS cut generator uses exact arithmetic, which is required by Lrslib.
This is a factor which slows down the MDS cut generator.

Figure 8.1: Comparison of different branching rules

In fact many cuts are generated repeatedly in the optimization process.
The pseudo-knapsack idea in Section 6.2 can force the algorithm to generate
a different cut each time, but the performance turns out to be worse, and
more search tree nodes are processed. With the pseudo-knapsack idea, if
the algorithm generates a cut with a probability less than 1 on each node, the
performance improves to some extent, although it is still worse than that
without pseudo-knapsack. We also observe that the pseudo-knapsack idea
can make the algorithm faster when greedy branching is applied. This
suggests that the pseudo-knapsack idea interferes with the strong branching
rule. The reason might be that this idea makes the values of the binary
variables in the solution of the LP relaxation closer to each other.

In Figure 8.3 and Figure 8.4 we compare the performance of our algorithm
with the two different cutting plane generators. The general cut generators
in Cgl can barely generate cuts for our algorithm, and do not improve the
performance.

Figure 8.3: Comparison of different cutting plane generators



8.2.3 Comparing Algorithms 

The performance of the binary search algorithm in Section 6.3.2 is given
in Table A.19, Table A.20, Table A.21, and Table A.22. The time in the
tables is the total time of solving all MIPs during the binary search process. The
performance of the primal-dual algorithm on the same data sets is given in
Table A.23, Table A.24, Table A.25, and Table A.26. The binary search
algorithm does not perform too badly, but the primal-dual algorithm is very
slow on some hard problems. In Figure 8.5 and Figure 8.6 we compare the
performance of the binary search algorithm, the primal-dual algorithm, and
the branch and cut algorithm. The branch and cut algorithm works best
most of the time. The binary search algorithm is actually
quite fast (as well as being more numerically stable). Sometimes the binary
search algorithm even works better than the branch and cut algorithm. The
reason is that the MIPs for the binary search algorithm are usually easier to
solve, and the tricks used in the binary search algorithm also help to speed
up the algorithm. In contrast, the primal-dual algorithm can be slow on large
problems.

Figure 8.4: Comparison of different cutting plane generators




Figure 8.5: Comparison of different algorithms 



8.2.4 Parallel Execution 

All the above tests are done with the sequential version of our algorithm.
Some tests of the parallel version of the branch and cut algorithm are given in
Figure 8.7. Two data sets are used to test the algorithm. The performance
with one processor is the performance of the sequential version of the
algorithm. When two processors are used, one of them runs the slave
process (LP process), and when four processors are used, three of them
run slave processes. So we expect a speedup of 3 for four processors,
7 for eight processors, and so forth. The dashed line in the figure indicates
the linear speedup with respect to the number of LP processes. From the figure
we can see that the speedup is almost linear.
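The expected speedup used as the reference line can be computed as in the following sketch. The function names are hypothetical, and the timings in the example are made up for illustration; they are not measurements from our tests.

```python
def expected_speedup(processors):
    """Ideal speedup when one processor is reserved for the tree manager.

    With one processor the sequential version runs, so the speedup is 1.
    With p > 1 processors, p - 1 of them run LP processes.
    """
    if processors < 1:
        raise ValueError("need at least one processor")
    return max(1, processors - 1)


def observed_speedup(sequential_time, parallel_time):
    """Ratio of sequential to parallel running time."""
    return sequential_time / parallel_time


for p in (1, 2, 4, 8, 16):
    print(p, expected_speedup(p))

# Made-up timings: 3000 s sequentially, 1050 s on four processors.
ratio = observed_speedup(3000.0, 1050.0) / expected_speedup(4)
print(round(ratio, 3))  # fraction of the ideal speedup that was achieved
```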



Figure 8.7: Performance of the parallel execution (horizontal axis: number of
processors, 1 to 16; data sets: 80 points in dimension 5, depth 24, and
80 points in dimension 10, depth 1)
8.3 Results for ANOVA Data

The ANOVA data tested in this section are randomly generated according to
a scheme in [36]. The data points are in the form of $y_{i_1, \ldots, i_K}$. In this manuscript
Mizera gives the following description:



The index $i_k$ corresponds to a $k$-th factor. The values $(1, 2, \ldots, J_k)$
of this index are called factor levels (apparently, [it] is only this $J_k$
which is technically of interest here) and correspond to the
variants of the treatment that the factor represents. For instance,
a factor may correspond to a kind of soil (say, there are three
different ones involved in the experiment); another factor may
represent the type of the fertilizer used (say, five different
fertilizers used in the experiment); a datapoint then records the yield
on the plot (or a plot; there may be more plots of this kind)
corresponding to the particular combination of soil and fertilizer.

There are some duplicated points in every data set. For every data set,
the depth of the origin is computed. Table 8.1 gives a comparison of the
performance of our algorithm with different integer program formulations:
the simple MIP as in (3.7) and the weighted MIP as described in Section 7.2.1.
The sequential algorithm is used for these tests. The upper bound of the
number of duplications in the data set is given in the third column. From
the table we can see that the algorithm using the weighted MIP is much
faster, because there are many fewer rows in the MIP.






Point #  Dimension  Duplication  Depth  Simple MIP  Weighted MIP
32       8          2            5      0.62        0.19
32       8          2            10     5.89        2.06
32       8          2            7      0.90        0.24
32       8          2            7      0.54        0.31
32       8          2            4      0.14        0.03
48       8          3            5      0.11        0.08
48       8          3            11     4.39        0.82
48       8          3            10     2.17        0.62
48       8          3            9      2.04        0.28
48       8          3            13     27.72       2.00
64       8          4            11     1.92        0.22
64       8          4            17     91.89       2.98
64       8          4            18     200.82      3.48
64       8          4            15     30.85       0.87
64       8          4            16     28.94       1.60
72       12         2            13     147.59      22.19
72       12         2            18     807.20      250.77
72       12         2            14     85.94       33.52
72       12         2            17     529.16      69.92
72       12         2            20     outmem      469.81
108      12         3            26     outmem      519.49
108      12         3            24     outmem      264.20
108      12         3            24     outmem      341.87
108      12         3            29     outmem      1435.35
108      12         3            22     outmem      105.49
144      12         4            33     outmem      1238.99
144      12         4            39     outmem      1760.49
144      12         4            40     outmem      1527.83
144      12         4            33     outmem      544.95
144      12         4            29     outtime     330.57

Table 8.1: Performance with different integer program formulations






Chapter 9 



Conclusions 

9.1 Summary of the Work 

We noted that the halfspace depth problem is equivalent to the maximum
feasible subsystem problem. We reviewed a heuristic algorithm suggested by
Chinneck. We also reviewed the branch and cut paradigm for MIP problems.
We developed a branch and cut algorithm for the halfspace depth problem.
Based on this algorithm, we developed a second, binary search algorithm for
the halfspace depth problem. Two cut generators were also developed for
our algorithm.

We evaluated different strategies for the branch and cut algorithm. We
also compared the branch and cut algorithm with the binary search
algorithm and the primal-dual algorithm, and concluded that the branch and
cut algorithm is the fastest, although with some numerical issues. The
binary search is slower, but still faster than the primal-dual algorithm and
more stable. Fast cutting plane generators are important: the BIS
cut generator improves the performance dramatically. The strong branching
rule is faster than the greedy branching rule on most of the tests, but the
greedy branching rule is faster on many hard problems (i.e. those with large
depth).

On ANOVA data sets, the duplicated constraints are removed with the 
weighted MIP formulation. With this modification, the algorithm solved all 
the problems we tested. 

9.2 Open Problems and Future Work 

In some applications, only the center of the data set is interesting. With the 
current algorithm we have to compute the depth of every data item in order 
to find the center. A fast algorithm for finding the center is open for future 
work. 

The idea for finding a proper ε described in Section 3.4 is not practical.
Another open problem is a method to find a practical ε for MIP (3.7).
It may be possible to solve an MIP based directly on the strict inequalities of
system (3.1), in which case we would not need to consider ε. The binary search
algorithm does not require a value for ε, and it can report a proper value for
ε. Ironically, this algorithm finds a proper value only after solving the halfspace
depth problem.

As we noticed in Section 8.2.2, the pseudo-knapsack idea slows down the
strong branching when using the BIS cut generator. An idea for reducing
redundant cut generation in the BIS cut generator that does not interfere
with the strong branching would be interesting.
Appendix A
Testing Results 



Num      The number of points in the input data set
Dim      The dimension of the data items
Suc      Success or not
TD       The depth of the search tree
UB       The upper bound found by the heuristic algorithm or the best
         solution found by the algorithm if the algorithm does not
         finish successfully
Nod      The number of processed nodes in the search tree
Tim      The CPU time used to find the solution
Dep      The optimal objective value of the problem
outtime  Running out of time
outmem   Running out of memory
(*)      The depth values with this note are one larger than the correct
         value

Table A.1: Abbreviations used in this chapter



d5   A group of data sets in dimension 5
d10  A group of data sets in dimension 10
n50  A group of data sets, each set consists of 50 points
sd5  A group of data sets in dimension 5; the points are symmetric around
     the origin in each set

Table A.2: Data sets used for the tests
A.1 Results of the Branch and Bound Algorithm

A.1.1 Results of the Greedy Branching and MDS Cut Generator



Num  Dim  Suc      Tim       TD  Nod     UB  Dep
20   5    yes      0.36      9   183     4   4
20   5    yes      0.48      10  157     5   5
30   5    yes      2.19      12  553     6   6
30   5    yes      1.00      9   317     4   4
30   5    yes      5.49      13  1255    8   8(*)
40   5    yes      1.07      10  91      5   5
40   5    yes      26.77     16  3101    10  10
40   5    yes      48.20     16  5857    11  11(*)
50   5    yes      63.03     16  7505    10  10
50   5    yes      277.55    20  27061   14  14
50   5    yes      341.83    20  25825   15  15
60   5    outmem                         16
60   5    outmem                         19
60   5    yes      257.61    19  15191   13  13
70   5    outmem                         21
70   5    outmem                         20
70   5    outmem                         23
80   5    outtime                        20
80   5    outmem                         22
80   5    outmem                         25
90   5    outmem                         29
90   5    outmem                         21
90   5    outmem                         24

Table A.3: Performance with the greedy branching rule, data set: d5
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   10   yes      78.80     16  5431    5   5
50   10   yes      6.74      13  377     3   3
50   10   yes      604.58    19  45741   7   7
60   10   yes      96.16     16  5315    5   5
60   10   yes      325.15    17  15939   6   6
60   10   yes      32.29     15  1639    5   4
70   10   yes      478.59    17  16019   6   6
70   10   outtime                        10
70   10   yes      1457.65   18  43105   7   7
80   10   outtime                        10
80   10   outtime                        8
80   10   outtime                        14
90   10   outtime                        18
90   10   outtime                        16
90   10   outtime                        13
110  10   outtime                        19
120  10   outtime                        21
120  10   outtime                        25
130  10   outtime                        28

Table A.4: Performance with the greedy branching rule, data set: d10
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   3    yes      41.45     21  2949    19
50   3             6.57      13  373     12  12
50   3    yes      38.29     19  2743    18  18
50   4    yes      120.65    19  7203    16  16
50   4    yes      26.70     16  1719    12  12
50   4    yes      28.55     15  2025    11  11
50   5    yes      18.46     12  853     9   9
50   5    yes      197.99    20  20725   14  13
50   5    yes      352.73    20  41845   14  13
50   6    outmem                         14
50   6    yes      449.99    19  58455   12  11
50   6    yes      162.71    17  14229   10  10
50   7    yes      1261.95   22  155239  13  11
50   7    yes      42.53     15  3169    7   7
50   7    yes      189.65    17  20883   8   8
50   8    yes      2835.12   21  290617  12  11
50   8    yes      8.85      13  891     4   4
50   8    yes      32.14     14  2677    5   5
50   9    yes      53.85     15  3871    6   5
50   9    yes      1590.48   20  149065  10  9
50   9    yes      641.22    19  66863   8   8
50   10   yes      27.92     15  1983    5   4
50   10   yes      128.03    17  9551    6   5
50   10   yes      237.98    18  16823   6   6
50   11   yes      369.11    19  24865   7   6
50   11   yes      104.49    18  7681    5   5
50   11   yes      1.45      13  83      2   2
50   12   yes      1.55      14  87      2   2
50   12   yes      10.49     15  581     3   3
50   12   yes      44.48     17  2705    4   4

Table A.5: Performance with the greedy branching rule, data set: n50
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
16   5    yes      0.49      11  307     5   4
16   5    yes      0.78      11  573     5   4
26   5    yes      6.21      14  2631    8   6
26   5    yes      9.50      14  3903    8   8(*)
36   5    yes      96.77     19  23611   13  12
36   5    yes      99.37     19  25147   13  12
46   5    yes      450.37    23  48993   16  16
46   5    yes      510.80    23  67557   18  15(*)
56   5    outmem                         21
56   5    outmem                         20
66   5    outmem                         24
66   5    outmem                         25
76   5    outmem                         30
76   5    outmem                         29
86   5    outmem                         35
86   5    outmem                         35

Table A.6: Performance with the greedy branching rule, data set: sd5
A.1.2 Results of the Strong Branching and MDS Cut Generator and Depth First Search



1 N Hill 


T)i m 

j_yiiii 




J. 1111 


TD 

i i_y 


NnH 


TIR 


Den 


on 





yes 


n 9/1 


Q 

y 


00 


A 


A 




K 



yes 


n '^Q 
u.oy 


Q 
y 


77 








ou 


o 


yes 


i.yo 


1 n 

iU 


9AQ 
z^y 


fi 
u 


fi 

u 


ou 


o 


yes 


n on 

u.yu 


Q 

y 


1 '^0 
±oz 


A 


A 


ou 





yes 


A AA 


1 

io 




c 
o 









yes 


n 7/1 


Q 

y 


•^/l 


p; 






An 





yes 


99 81 


iO 


1 fin7 


1 n 

iU 


1 n 

iU 


An 


o 


yes 


7Q 


1 fi 




1 1 
1 1 




^n 
ou 


K 
O 


yes 


fin 

ou. / o 


lO 


o / ou 


1 n 

iU 


1 n 

iU 


^n 
ou 


K 
O 


yes 


9QA 81 


9n 
zu 


1 89'?7 


1 A 


1 A 






yea 


451 83 


21 


22364 


15 


15 


60   5    yes      880.08    22  33964   16  16(*)
60   5    outmem                         19
60   5    yes      318.49    19  11297   13  13
70   5    outmem                         21
70   5    outmem                         20
70   5    outmem                         23
80   5    outtime                        20
80   5    outmem                         22
80   5    outmem                         25
90   5    outmem                         29
90   5    outmem                         21
90   5    outmem                         24

Table A.7: Performance with strong branching, data set: d5
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   10   yes      68.78     16  2592    5   5
50   10   yes      6.11      13  184     3   3
50   10   yes      551.35    18  21750   7   7
60   10   yes      90.50     16  2533    5   5
60   10   yes      295.14    17  7697    6   6
60   10   yes      69.53     16  2022    5   4
70   10   yes      452.90    17  7743    6   6
70   10   outtime                        10
70   10   yes      1415.18   19  21407   7   7
80   10   outtime                        10
80   10   outtime                        8
80   10   outtime                        15
90   10   outtime                        18
90   10   outmem                         16
90   10   outtime                        14
110  10   outtime                        20
120  10   outtime                        22
120  10   outtime                        25
130  10   outtime                        28

Table A.8: Performance with strong branching, data set: d10
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   3             18.23     19  618     19
50   3             1.87      10  53      12  12
50   3             28.40     20  1300    18  18
50   4             93.95     20  3812    16  16
50   4    yes      20.84     16  950     12  12
50   4    yes      29.37     15  1534    11  11
50   5    yes      8.26      13  279     9   9
50   5    yes      201.13    20  11376   14  13
50   5    yes      250.46    20  16958   14  13
50   6    yes      1096.93   22  60545   14  14
50   6    yes      614.84    19  45113   12  11
50   6    yes      158.91    17  7491    10  10
50   7    yes      1525.51   22  103429  13  11
50   7    yes      35.34     14  1410    7   7
50   7    yes      173.09    16  10087   8   8
50   8    yes      3280.20   22  187925  12  11
50   8    yes      7.80      13  414     4   4
50   8    yes      29.56     14  1295    5   5
50   9    yes      117.13    16  4786    6   5
50   9    yes      2764.54   21  142913  10  9
50   9    yes      574.78    18  32367   8   8
50   10   yes      66.55     16  2443    5   4
50   10   yes      201.70    17  8021    6   5
50   10   yes      228.01    17  8064    6   6
50   11   yes      908.46    19  31971   7   6
50   11   yes      93.19     17  3541    5   5
50   11   yes      1.27      13  37      2   2
50   12   yes      1.38      14  39      2   2
50   12   yes      9.00      15  252     3   3
50   12   yes      39.67     17  1223    4   4

Table A.9: Performance with strong branching, data set: n50
Num  Dim  Suc      Tim       TD  Nod     UB  Dep
16   5    yes      0.35      10  130     5   4
16   5    yes      0.44      10  175     5   4
26   5    yes      5.98      14  1393    8   6
26   5    yes      8.28      15  1821    8   8(*)
36   5    yes      85.98     20  12477   13  12
36   5    yes      93.01     19  11764   13  12
46   5    yes      596.31    23  42126   16  16
46   5    yes      477.06    26  36778   18  15(*)
56   5    outmem                         21
56   5    outmem                         20
66   5    outmem                         24
66   5    outmem                         25
76   5    outmem                         30
76   5    outmem                         30
86   5    outmem                         35
86   5    outmem                         37

Table A.10: Performance with strong branching, data set: sd5
A.1.3 Results of the Strong Branching and MDS Cut Generator and Best First Search



Num  Dim  Suc      Tim       TD  Nod     UB  Dep
20   5    yes      0.24      9   65      4   4
20   5    yes      0.49      9   96      5   5
30   5    yes      2.10      10  264     6   6
30   5    yes      0.89      9   152     4   4
30   5    yes      4.34      13  513     8
40   5    yes      0.72      8   31      5   5
40   5    yes      22.31     16  1540    10  10
40   5    yes      45.00     16  3841    11  11(*)
50   5    yes      61.19     15  3754    10  10
50   5    yes      244.03    20  16455   14  14
50   5    yes      400.39    22  20545   15  15
60   5    outmem                         16
60   5    outmem                         19
60   5    yes      306.79    18  12123   13  13
70   5    outmem                         22
70   5    outmem                         20
70   5    outmem                         23
80   5    outmem                         21
80   5    outmem                         22
80   5    outmem                         25
90   5    outmem                         29
90   5    outmem                         21
90   5    outmem                         24

Table A.11: Performance with best first tree search strategy, data set: d5
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Num 


Dim 


Sue 


Tim 


TD 


Nod 


UB 


Dep 


50 


10 


yes 


68.97 


16 


2592 


5 


5 


50 


10 


yes 


6.10 


13 


184 


3 


3 


50 


10 


yes 


553.80 


18 


21750 


7 


7 


60 


10 


yes 


90.75 


16 


2533 


5 


5 


60 


10 


yes 


296.55 


17 


7697 


6 


6 


60 


10 


yes 


54.45 


15 


2380 


5 


4 


70 


10 


yes 


453.98 


17 


7743 


6 


6 


70 


10 


outtime 








10 




70 


10 


yes 


1416.57 


19 


21407 


7 


7 


80 


10 


outtime 








10 




80 


10 


outtime 








8 




80 


10 


outtime 








15 




90 


10 


outtime 








18 




90 


10 


outtime 








16 




90 


10 


outtime 








14 




110 


10 


outtime 








20 




120 


10 


outtime 








22 




120 


10 


outtime 








25 




130 


10 


outtime 








28 





Table A.12: Performance with best first tree search strategy, data set: d10

Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   3             15.49     20  588     19
50   3             2.06      12  69      12  12
50   3             30.43     20  1411    18  18
50   4    yes      78.82     20  3469    16  16
50   4    no
50   4             25.89     15  1456    11  11
50   5    yes      7.37      13  261     9   9
50   5    no
50   5    yes      290.62    19  23463   14  13
50   6    no
50   6    yes      519.63    19  44841   12  11
50   6    yes      147.15    16  7203    10  10
50   7    yes      1240.54   21  112366  13  11
50   7    yes      35.02     14  1411    7   7
50   7    yes      171.64    16  10087   8   8
50   8    yes      3448.78   22  250539  12  11
50   8    yes      7.75      13  414     4   4
50   8    yes      29.25     14  1295    5   5
50   9    yes      78.47     16  5030    6   5
50   9    yes      2182.89   21  142552  10  9
50   9    yes      571.29    18  32368   8   8
50   10   yes      39.57     15  2376    5   4
50   10   yes      116.86    17  7404    6   5
50   10   yes      228.21    17  8064    6   6
50   11   yes      592.47    19  31130   7   6
50   11   yes      93.12     17  3541    5   5
50   11   yes      1.27      13  37      2   2
50   12   yes      1.38      14  39      2   2
50   12   yes      9.00      15  253     3   3
50   12   yes      39.61     17  1222    4   4

Table A.13: Performance with best first tree search strategy, data set: n50

Num  Dim  Suc      Tim       TD  Nod     UB  Dep
16   5    yes      0.34      9   180     5   4
16   5    yes      0.54      10  345     5   4
26   5    yes      3.46      13  1050    8   6
26   5    yes      8.04      15  1839    8   8(*)
36   5    yes      89.81     20  13844   13  12
36   5    yes      86.77     19  11768   13  12
46   5    no
46   5    yes      551.51    25  51826   18  15(*)
56   5    outmem                         21
56   5    outmem
66   5    outmem                         24
66   5    outmem                         25
76   5    outmem                         30
76   5    outmem                         30
86   5    outmem                         35
86   5    outmem                         37

Table A.14: Performance with best first tree search strategy, data set: sd5

A.1.4 Results of the Strong Branching and BIS Cut Generator



Num  Dim  Suc      Tim       TD  Nod     UB  Dep
20   5    yes      0.16      8   63      4   4
20   5    yes      0.44      9   181     5   5
30   5    yes      1.07      10  377     6   6
30   5    yes      0.17      8   57      4   4
30   5    yes      3.83      13  1447    8   7
40   5    yes      0.55      10  166     5   5
40   5    yes      14.12     15  4571    10  10
40   5    yes      22.79     16  7511    11  10
50   5    yes      15.65     15  4420    10  10
50   5    yes      89.68     20  25492   14  14
50   5    yes      129.29    20  37698   15  15
60   5    yes      177.52    22  45043   16  15
60   5    yes      366.92    25  95820   19  17
60   5    yes      72.54     18  17240   13  13
70   5    yes      1246.22   28  287853  22  21
70   5    yes      777.33    27  178746  20  19
70   5    yes      1689.58   29  393702  23  23
80   5    yes      1221.53   28  248762  21  20
80   5    yes      990.78    28  198836  22  19
80   5    yes      2875.70   31  603491  25  24
90   5    outmem                         29
90   5    yes      1395.70   28  250491  21  21
90   5    yes      2927.97   31  539796  24  24

Table A.15: Performance with BIS cut generator, data set: d5

Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   10   yes      3.97      15  807     5   5
50   10   yes      0.22      12  43      3   3
50   10   yes      42.17     18  9220    7   7
60   10   yes      4.27      14  753     5   5
60   10   yes      15.43     16  2722    6   6
60   10   yes      3.50      14  620     5   4
70   10   yes      17.04     16  2680    6   6
70   10   yes      796.99    21  132672  10  10
70   10   yes      52.06     19  8267    7   7
80   10   yes      903.12    21  130127  10  10
80   10   yes      164.53    21  22439   8   8
80   10   outtime                        15
90   10   outmem                         18
90   10   outmem                         16
90   10   outmem                         14
110  10   outmem                         20
120  10   outmem                         22
120  10   outmem                         25
130  10   outmem                         28

Table A.16: Performance with BIS cut generator, data set: d10

Num  Dim  Suc      Tim       TD  Nod     UB  Dep
50   3    yes      20.79     22  6600    19  18
50   3    yes      4.30      15  1286    12  12
50   3    yes      17.79     21  5676    18  18
50   4    yes      50.47     21  15062   16  16
50   4    yes      14.03     16  4086    12  12
50   4    yes      9.42      15  2774    11  11
50   5    yes      9.81      14  2750    9   9
50   5    yes      85.41     20  24138   14  13
50   5    yes      72.49     19  21080   14  13
50   6    yes      268.88    21  75044   14  14
50   6    yes      86.39     19  23175   12  11
50   6    yes      37.20     16  9929    10  10
50   7    yes      333.07    21  87192   13  11
50   7    yes      9.28      14  2282    7   7
50   7    yes      21.02     15  5227    8   8
50   8    yes      447.77    21  112751  12  11
50   8    yes      0.57      11  123     4   4
50   8    yes      1.99      12  457     5   5
50   9    yes      7.96      15  1700    6   5
50   9    yes      324.54    20  77184   10  9
50   9    yes      66.37     18  15087   8   8
50   10   yes      3.96      15  801     5   4
50   10   yes      8.39      16  1774    6   5
50   10   yes      14.29     16  2996    6   6
50   11   yes      54.82     18  11364   7   6
50   11   yes      5.75      16  1078    5   5
50   11   yes      0.03      1   2       2   2
50   12   yes      0.05      5   7       2   2
50   12   yes      0.34      14  59      3   3
50   12   yes      1.64      15  293     4   4

Table A.17: Performance with BIS cut generator, data set: n50

Num  Dim  Suc      Tim       TD  Nod     UB  Dep
16   5    yes      0.18      10  78      5   4
16   5    yes      0.23      11  108     5   4
26   5    yes      3.37      14  1356    8   6
26   5    yes      2.50      14  1016    8   7
36   5    yes      38.78     18  12909   13  12
36   5    yes      37.01     21  13183   13  12
46   5    yes      188.91    23  59063   16  16
46   5    yes      139.06    25  43252   18  14
56   5    yes      727.58    21  202080  21  19
56   5    yes      882.08    29  242383  24  20
66   5    yes      2126.74   31  536278  24  22
66   5    yes      2570.02   32  648729  25  24
76   5    outmem                         30
76   5    outmem                         29
86   5    outmem                         35
86   5    outmem                         35

Table A.18: Performance with BIS cut generator, data set: sd5

A.2 Results of the Binary Search Algorithm



Num  Dim  Suc      Tim          Dep
20   5    yes      [illegible]  4
20   5    yes      [illegible]
30   5    yes      [illegible]
30   5    yes      [illegible]  4
30   5    yes      [illegible]  7
40   5    yes      [illegible]  [illegible]
40   5    yes      [illegible]  10
40   5    yes      [illegible]  10
50   5    yes      [illegible]  10
50   5    yes      [illegible]  14
50   5    yes      [illegible]  [illegible]
60   5             348.43
60   5    yes      2389.03      17
60   5    yes      121.78       13
70   5    outtime
70   5    outtime
70   5    outtime
80   5    outtime
80   5    outtime
80   5    outtime
90   5    outtime
90   5    outtime
90   5    outtime

Table A.19: Performance of the binary search, data set: d5

Num  Dim  Suc      Tim       Dep
50   10   yes      1.45      5
50   10   yes      0.09      3
50   10   yes      19.13     7
60   10   yes      1.68      5
60   10   yes      6.22      6
60   10   yes      1.21      4
70   10   yes      7.05      6
70   10   yes      855.21    10
70   10   yes      23.84     7
80   10   yes      1062.49   10
80   10   yes      103.47    8
80   10   outtime
90   10   outtime
90   10   outtime
90   10   outtime
110  10   outtime
120  10   outtime
120  10   outtime
130  10   outtime

Table A.20: Performance of the binary search, data set: d10

Num  Dim  Suc      Tim       Dep
50   3    yes      48.63     18
50   3    yes      8.10      12
50   3    yes      71.81     18
50   4    yes      217.37    16
50   4    yes      20.78     12
50   4    yes      15.93     11
50   5    yes      12.13     9
50   5    yes      135.55    13
50   5    yes      164.87    13
50   6    yes      509.81    14
50   6    yes      171.37    11
50   6    yes      54.10     10
50   7    yes      448.17    11
50   7    yes      5.38      7
50   7    yes      18.76     8
50   8    yes      451.09    11
50   8    yes      0.22      4
50   8    yes      0.99      5
50   9    yes      3.95      5
50   9    yes      213.69    9
50   9    yes      48.82     8
50   10   yes      1.70      4
50   10   yes      2.54      5
50   10   yes      5.54      6
50   11   yes      32.68     6
50   11   yes      1.94      5
50   11   yes      0.01      2
50   12   yes      0.01      2
50   12   yes      0.12      3
50   12   yes      0.52      4

Table A.21: Performance of the binary search, data set: n50

Num  Dim  Suc      Tim       Dep
16   5    yes      0.16      4
16   5    yes      0.15      4
26   5    yes      1.72      6
26   5    yes      1.97      7
36   5    yes      116.26    12
36   5    yes      99.88     12
46   5    yes      960.48    16
46   5    yes      457.56    14
56   5    yes      3585.18   19
56   5    outtime
66   5    outtime
66   5    outtime
76   5    outtime
76   5    outtime
86   5    outtime
86   5    outtime

Table A.22: Performance of the binary search, data set: sd5

A.3 Results of the Primal-Dual Algorithm



Num  Dim  Suc      Tim          Dep
20   5    yes      [illegible]  4
20   5    yes      [illegible]  [illegible]
30   5    yes      [illegible]
30   5    yes      [illegible]  4
30   5    yes      [illegible]  [illegible]
40   5    yes      [illegible]
40   5    yes      [illegible]  10
40   5    yes      [illegible]  10
50   5    yes      [illegible]  10
50   5    yes      [illegible]  14
50   5    yes      [illegible]  [illegible]
60   5             [illegible]
60   5    yes      8810.34      17
60   5    yes      99.21        13
70   5    outtime
70   5    yes      26274.70     19
70   5    outtime
80   5    yes      13232.58     20
80   5    yes      3899.48      19
80   5    outtime
90   5    outtime
90   5    yes      12821.73     21
90   5    outtime

Table A.23: Performance of the primal-dual algorithm, data set: d5

Num  Dim  Suc      Tim       Dep
50   10   yes      45.05     5
50   10   yes      2.41      3
50   10   yes      231.06    7
60   10   yes      16.16     5
60   10   yes      35.98     6
60   10   yes      5.50      4
70   10   yes      26.22     6
70   10   outtime
70   10   yes      40.88     7
80   10   yes      38915.29  10
80   10   yes      121.22    8
80   10   outtime
90   10   outtime
90   10   outtime
90   10   outtime
110  10   outtime
120  10   outtime
120  10   outtime
130  10   outtime

Table A.24: Performance of the primal-dual algorithm, data set: d10

Num  Dim  Suc      Tim       Dep
50   3    yes      47.94     18
50   3    yes      1.30      12
50   3    yes      60.31     18
50   4    yes      287.33    16
50   4    yes      16.07     12
50   4    yes      25.47     11
50   5    yes      9.53      9
50   5    yes      278.84    13
50   5    yes      514.03    13
50   6    yes      16052.87  14
50   6    yes      1031.67   11
50   6    yes      279.75    10
50   7    yes      4797.68   11
50   7    yes      12.94     7
50   7    yes      41.91     8
50   8    yes      16336.03  11
50   8    yes      1.33      4
50   8    yes      4.71      5
50   9    yes      4.23      5
50   9    yes      10425.83  9
50   9    yes      2440.02   8
50   10   yes      8.03      4
50   10   yes      7.82      5
50   10   yes      112.04    6
50   11   yes      171.47    6
50   11   yes      14.90     5
50   11   yes      2.77      2
50   12   yes      1.80      2
50   12   yes      2.41      3
50   12   yes      4.28      4

Table A.25: Performance of the primal-dual algorithm, data set: n50

Num  Dim  Suc      Tim       Dep
16   5    yes      0.32      4
16   5    yes      0.24      4
26   5    yes      3.48      6
26   5    yes      5.08      7
36   5    yes      394.41    12
36   5    yes      715.92    12
46   5    yes      23908.49  16
46   5    yes      4093.35   14
56   5    outtime
56   5    outtime
66   5    outtime
66   5    outtime
76   5    outtime
76   5    outtime
86   5    outtime
86   5    outtime

Table A.26: Performance of the primal-dual algorithm, data set: sd5